Interactive Illumination Invariance: A User-Guided Approach to Robust Image Processing

An analysis of a simple interactive system for deriving illumination-invariant images, addressing the failures of automatic methods on non-linear images and complex scenes.

1. Introduction & Overview

Illumination variations, especially shadows, pose significant challenges for computer vision algorithms and affect tasks ranging from image segmentation to object recognition. Traditional automatic methods for deriving illumination-invariant images often struggle with non-linearly rendered images (e.g., JPEGs from consumer cameras) and with complex scenes where illumination changes are hard to model automatically. This paper by Gong and Finlayson introduces an interactive, user-guided system that lets users specify which kind of illumination variation to remove, improving both robustness and usability.

The central idea is to move beyond fully automatic, one-size-fits-all solutions. By incorporating a simple user input, a stroke marking a region affected by one specific illumination change, the system can tailor the derivation of the invariant image and give more accurate results on challenging real-world images.

Key Insights

  • User-in-the-Loop Simplicity: Addresses the limits of automatic methods by using minimal user input for guidance.
  • Robustness to Non-Linearity: Designed specifically to handle gamma correction, tone mapping, and the other non-linear processing steps common in photography.
  • Targeted Illumination Removal: Allows a specific illumination effect (e.g., one particular shadow) to be removed without affecting global lighting or texture.

2. Core Methodology

The approach bridges the gap between fully automatic intrinsic image decomposition and practical, user-centric image editing tools.

2.1 User Input Mechanism

The system requires only a single stroke from the user. The stroke should cover a region where variations in pixel intensity are caused mainly by the illumination effect the user wants to remove (e.g., a shadow penumbra). This input gives the algorithm the key cue it needs to isolate the illumination vector in color space.

Advantage: This is far less labor-intensive than requiring precise matting or a full segmentation, which makes it practical for casual users and professionals alike.
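
To make the input concrete, the sketch below shows one possible way a freehand stroke could be turned into the set of pixels handed to the algorithm. It is an illustration only: the function name `scribble_to_mask`, the `(row, col)` point format, and the `radius` brush parameter are assumptions introduced here, not details from the paper.

```python
import numpy as np

def scribble_to_mask(shape, points, radius=3):
    """Rasterize a freehand stroke into a boolean pixel mask.

    shape  : (height, width) of the image
    points : sequence of (row, col) stroke samples, in drawing order
    radius : brush radius in pixels (hypothetical parameter)
    """
    h, w = shape
    rows, cols = np.mgrid[0:h, 0:w]
    mask = np.zeros((h, w), dtype=bool)
    pts = np.asarray(points, dtype=float)
    # A single click is treated as a zero-length segment.
    segments = list(zip(pts[:-1], pts[1:])) if len(pts) > 1 else [(pts[0], pts[0])]
    for p, q in segments:
        # Sample the segment at roughly one-pixel steps so fast strokes leave no gaps.
        n = int(np.ceil(np.abs(q - p).max())) + 1
        for t in np.linspace(0.0, 1.0, n):
            r, c = p + t * (q - p)
            # Stamp a filled disc of the brush radius at this sample.
            mask |= (rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2
    return mask
```

The pixels selected by the mask (for example `image[mask]`) would then serve as the samples from which the illumination direction is estimated.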

2.2 Illumination-Invariant Derivation

Building on a physics-based model of illumination, the method operates in log-chrominance space. The user's stroke defines a set of pixels assumed to come from a single surface under varying illumination. The algorithm then estimates the direction of illumination change within this subspace and computes the projection orthogonal to that direction to obtain the illumination-invariant component.

The pipeline can be summarized as: Input Image → Log RGB Transform → User Stroke Guidance → Illumination Direction Estimation → Orthogonal Projection → Illumination-Invariant Output.

3. Technical Framework

3.1 Mathematical Foundation

The method is grounded in the dichromatic reflection model and in the observation that, for many natural illuminants, a change in illumination corresponds to a shift along a specific direction in log RGB space. For a pixel I under Planckian-like illumination, the log-chrominance values lie on a line. Different materials produce parallel lines. The illumination-invariant image I_inv is obtained by projecting the log-image onto the direction orthogonal to the estimated illumination-change vector u.
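
The "parallel lines" statement can be motivated with a short derivation. The version below is the standard textbook argument rather than something taken from this summary, and it rests on assumptions that are not stated here: a Lambertian surface, narrow-band sensors with peak wavelengths $\lambda_k$, and Wien's approximation to a Planckian illuminant of color temperature $T$. The symbols $S$ (surface reflectance), $q_k$ (sensor gain), $\sigma$ (shading and intensity factor), and the constants $c_1$, $c_2$ are introduced for illustration.

$$ \rho_k = \sigma \, q_k \, S(\lambda_k) \, c_1 \lambda_k^{-5} \, e^{-c_2 / (\lambda_k T)}, \qquad k \in \{R, G, B\} $$

Taking logarithms and dividing out a reference channel $p$ cancels $\sigma$:

$$ \chi_k = \log \frac{\rho_k}{\rho_p} = (s_k - s_p) + \frac{1}{T}\,(e_k - e_p), \qquad s_k = \log\!\left(q_k S(\lambda_k) c_1 \lambda_k^{-5}\right), \quad e_k = -\frac{c_2}{\lambda_k} $$

Under these assumptions the offset $(s_k - s_p)$ depends only on the surface, while the direction $(e_k - e_p)$ depends only on the sensor. Changing $T$ therefore slides every surface along the same direction, which plays the role of $u$, and different surfaces trace parallel lines.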

Core Formula: The projection for a pixel's log-chrominance vector χ is given by: $$ I_{\text{inv}} = \chi - (\chi \cdot \hat{u}) \hat{u} $$ where $\hat{u}$ is the unit vector in the estimated illumination direction. The user's stroke supplies the data needed to estimate u robustly, particularly in non-linear images where global entropy minimization (as in the earlier work of Finlayson et al.) fails.
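
The projection formula maps directly onto a few lines of array code. The following is a minimal sketch, assuming the log-chrominance vectors are stored in the last axis of a NumPy array; the function name `remove_illumination_direction` is hypothetical.

```python
import numpy as np

def remove_illumination_direction(chi, u):
    """Project log-chrominance vectors onto the subspace orthogonal to u.

    chi : array of shape (..., d) holding one log-chrominance vector per pixel
    u   : array of shape (d,), estimated illumination direction (any non-zero length)
    """
    u_hat = u / np.linalg.norm(u)          # normalize to a unit vector
    coeff = chi @ u_hat                    # (chi . u_hat) for every pixel
    return chi - coeff[..., None] * u_hat  # subtract the component along u_hat

# Example: with u along the first axis, only the second coordinate survives.
chi = np.array([[1.0, 2.0], [3.0, -1.0]])
invariant = remove_illumination_direction(chi, np.array([2.0, 0.0]))
```

Every output vector has a zero dot product with `u`, so two pixels that differ only by a shift along the illumination direction map to the same value.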

3.2 Algorithm Workflow

  1. Preprocessing: Convert the input image to log RGB space.
  2. User Interaction: Acquire the stroke input over the targeted illumination-variation region.
  3. Local Estimation: Compute the principal direction of variation (the illumination direction u) from the pixels under the stroke.
  4. Global Application: Apply the projection orthogonal to u across the whole image to produce the illumination-invariant version.
  5. Post-processing: Optionally map the invariant channel to a viewable grayscale or false-color image.
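
The workflow above can be sketched end to end as follows. This is an illustrative reading of the steps, not the authors' implementation. It assumes a two-dimensional log-chrominance encoding that uses green as the reference channel, and it uses the first principal component (via SVD) of the stroke pixels as the "principal direction of variation". All function names are hypothetical.

```python
import numpy as np

def log_chrominance(rgb, eps=1e-6):
    """Map an (H, W, 3) image to (H, W, 2) log-chrominance coordinates."""
    log_rgb = np.log(np.clip(rgb, eps, None))  # clip to avoid log(0)
    # Subtracting log(G) cancels overall intensity (assumed encoding).
    return np.stack([log_rgb[..., 0] - log_rgb[..., 1],
                     log_rgb[..., 2] - log_rgb[..., 1]], axis=-1)

def estimate_direction(chi, mask):
    """Unit vector along the largest variation among the stroke pixels."""
    samples = chi[mask]                        # shape (N, 2)
    centered = samples - samples.mean(axis=0)
    # The first right-singular vector is the first principal direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def illumination_invariant(rgb, mask):
    """Return a single-channel (H, W) illumination-invariant image."""
    chi = log_chrominance(rgb)
    u_hat = estimate_direction(chi, mask)
    projected = chi - (chi @ u_hat)[..., None] * u_hat
    # In 2-D the residual lies on one line, so read it off as a scalar.
    normal = np.array([-u_hat[1], u_hat[0]])
    return projected @ normal
```

For viewing, the scalar output could be rescaled to the range [0, 1] and shown as grayscale. The sign of the result depends on the sign the SVD picks for `u_hat`, so only differences between pixels are meaningful.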

4. Experimental Results & Evaluation

The paper presents evaluations demonstrating the system's effectiveness.

4.1 Performance Metrics

Both qualitative and quantitative evaluations were carried out. The method successfully removes the targeted shadows and illumination gradients while preserving surface texture and material edges. It is particularly robust in handling:

  • Soft Shadows & Penumbras: Regions where shadow boundaries are diffuse and hard to detect automatically.
  • Non-Linear Images: Ordinary sRGB images, where illumination invariants built on strict physical assumptions break down.
  • Complex Scenes: Scenes with multiple materials and interreflections, where global illumination estimation is noisy.

4.2 Comparative Analysis

Compared with fully automatic intrinsic image decomposition methods (e.g., Bell et al., 2014) and with shadow-removal techniques, the interactive method gives superior results on user-specified tasks. It avoids typical artifacts such as:

  • Texture Discoloration: Where shading is mistakenly interpreted as reflectance.
  • Incomplete Removal: Where soft shadows or complex illumination are partially retained.
  • Over-Removal: Where genuine material changes are mistakenly smoothed out.

The trade-off is the need for minimal user input, which is positioned as a worthwhile cost for guaranteed, targeted accuracy.

5. Analysis Framework & Case Study

Analyst's Perspective: Core Insight, Logical Flow, Strengths & Flaws, Actionable Insights

Core Insight: The work of Gong and Finlayson is a pragmatic pivot in computational photography. The field's obsession with full automation has often hit a wall against the messy reality of non-linear imaging pipelines and complex scene geometry. Their core insight is brilliant in its simplicity: use a human's superior perceptual understanding of "what a shadow is" to bootstrap a physically grounded algorithm. This hybrid approach acknowledges what deep learning practitioners are now rediscovering: some tasks are easier for humans to specify than for algorithms to infer from first principles. It targets the Achilles' heel of earlier entropy-minimization methods, which, as the authors note, fail badly on the very consumer images (family photos, web images) where illumination editing is most wanted.

Logical Flow: The logic is elegantly reductive. 1) Accept that the physical model (Planckian illumination, linear sensors) is an imperfect fit for the input data. 2) Instead of forcing a global fit, localize the problem. Let the user identify a patch where the model should hold (e.g., "this is all grass, but part of it is in sun and part in shade"). 3) Use this clean local data to estimate the model parameters reliably. 4) Apply the now-calibrated model globally. This flow from local calibration to global application is the method's secret sauce, and it mirrors strategies in color constancy where a known "white patch" can calibrate an entire scene.

Strengths & Flaws: The main strength is robust usability. By sidestepping the need for linear RAW input, it works on the 99% of images that people actually have. The user interaction, although a drawback from a purely automatic standpoint, is its greatest practical strength: it makes the system predictable and controllable. The main flaw is the exclusive focus on a single illumination vector. Complex scenes with multiple colored light sources (e.g., indoor lighting with lamps and windows) would require multiple strokes and a more sophisticated decomposition model that goes beyond a single-direction projection. In addition, the method assumes the user's stroke is "correct", that is, that it selects a region of uniform reflectance. A misplaced stroke can lead to incorrect removal or introduce artifacts.

Actionable Insights: For researchers, this paper is a blueprint for human-in-the-loop computer vision. The next step is clear: replace the simple stroke with a more sophisticated interaction (e.g., scribbles labeled "shadow" and "reflectance") or use an AI first-pass segmentation to propose the region to the user. For industry, the technology is ripe for integration into image-editing tools such as Adobe Photoshop or GIMP as a dedicated "Remove Shadow" or "Normalize Lighting" brush. The computational cost is low enough for real-time preview. The most exciting direction is to use this method to generate training data for fully automatic systems. One could use the interactive tool to build a large dataset of image pairs (with and without specific shadows) to train a deep network, much as CycleGAN uses unpaired data to learn style transfer. This would bridge the gap between the precision of interactive tools and the convenience of automation.

6. Future Applications & Directions

  • Advanced Photo Editing Tools: Integration as a brush tool in professional and consumer software for precise shadow and lighting manipulation.
  • Preprocessing for Vision Systems: Producing illumination-invariant inputs for more robust object detection, recognition, and tracking in surveillance, autonomous vehicles, and robotics, especially in environments with strong, changing shadows.
  • Data Augmentation for Machine Learning: Varying lighting conditions in training datasets to improve model generalization, as explored in areas such as face recognition to reduce lighting bias.
  • Augmented & Virtual Reality: Real-time illumination normalization for consistent object insertion and scene composition.
  • Cultural Heritage & Documentation: Removing distracting shadows from photographs of documents, paintings, or archaeological sites for clearer analysis.
  • Future Research: Extending the model to handle multiple illuminant colors, integrating deep learning to suggest strokes automatically, and exploring temporal consistency for video processing.

7. References

  1. Gong, H., & Finlayson, G. D. (n.d.). Interactive Illumination Invariance. University of East Anglia.
  2. Bell, S., Bala, K., & Snavely, N. (2014). Intrinsic Images in the Wild. ACM Transactions on Graphics (TOG), 33(4), 1–12.
  3. Finlayson, G. D., Drew, M. S., & Lu, C. (2009). Entropy Minimization for Shadow Removal. International Journal of Computer Vision (IJCV), 85(1), 35–57.
  4. Zhu, J.-Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. IEEE International Conference on Computer Vision (ICCV). (CycleGAN)
  5. Land, E. H., & McCann, J. J. (1971). Lightness and Retinex Theory. Journal of the Optical Society of America, 61(1), 1–11.
  6. Barron, J. T., & Malik, J. (2015). Shape, Illumination, and Reflectance from Shading. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 37(8), 1670–1687.
  7. Google AI Blog and MIT CSAIL publications on intrinsic images and shadow detection.