1. Introduction
The proliferation of mobile devices has driven demand for advanced Augmented Reality (AR) applications, such as photorealistic scene enhancement and telepresence. A cornerstone of these applications is accurate, consistent lighting estimation from single images or video sequences. The task is especially hard in indoor scenes because of the complex interplay of geometry, materials, and diverse light sources, which often involves long-range interactions and occlusion.
Input from consumer devices is typically a Low Dynamic Range (LDR) image with a limited field of view (for example, covering only about 6% of a panoramic scene). The main challenge is therefore to infer the missing HDR information and to reason about unseen parts of the scene (such as light sources outside the frame) in order to produce a complete, spatially consistent lighting model. For video input, predictions must also be temporally stable to avoid flicker or jarring transitions in AR composites.
This paper presents the first framework designed to achieve spatiotemporally consistent HDR indoor lighting estimation. It predicts lighting at any image position from a single LDR image and a depth map and, given a video sequence, progressively refines its predictions while maintaining smooth temporal coherence.
2. Methodology
The proposed framework is a multi-component deep learning system designed around physical principles.
2.1. Spherical Gaussian Lighting Volume (SGLV)
The core representation is the Spherical Gaussian Lighting Volume (SGLV). Rather than predicting a single environment map for the whole scene, the method reconstructs a 3D volume in which each voxel holds the parameters of a set of spherical Gaussians (SGs) that represent the local lighting distribution. Spherical Gaussians are an efficient approximation for complex lighting, defined as $G(\mathbf{v}; \mathbf{\mu}, \lambda, a) = a \cdot e^{\lambda(\mathbf{\mu} \cdot \mathbf{v} - 1)}$, where $\mathbf{v}$ is the unit query direction, $\mathbf{\mu}$ is the lobe axis, $\lambda$ is the lobe sharpness, and $a$ is the lobe amplitude. This volumetric representation is key to achieving spatial consistency.
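The spherical Gaussian definition above can be evaluated directly. The following is a minimal NumPy sketch, not the authors' code; the function name `eval_sg` and the choice of an RGB amplitude are assumptions made for illustration.

```python
import numpy as np

def eval_sg(v, mu, lam, a):
    """Evaluate G(v; mu, lam, a) = a * exp(lam * (mu . v - 1)).

    v:   unit direction(s), shape (3,) or (N, 3)
    mu:  unit lobe axis, shape (3,)
    lam: lobe sharpness (scalar)
    a:   lobe amplitude, scalar or RGB vector of shape (3,)
    """
    v = np.asarray(v, dtype=float)
    mu = np.asarray(mu, dtype=float)
    a = np.asarray(a, dtype=float)
    # The exponent is 0 when v == mu (peak value a) and -2*lam when v == -mu.
    falloff = np.asarray(np.exp(lam * (v @ mu - 1.0)))
    # Outer product so the result has shape falloff.shape + a.shape.
    return np.multiply.outer(falloff, a)
```

The lobe peaks at `a` along its axis and decays faster as `lam` grows, which is why a small set of lobes can approximate both broad ambient light and sharp light sources.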
2.2. 3D Encoder-Decoder Architecture
A tailored 3D encoder-decoder network takes the LDR image and its depth map as input (aligned to a common 3D space) and outputs the SGLV. The encoder extracts multi-scale features, while the decoder upsamples them to reconstruct the high-resolution volume.
2.3. Volume Ray Tracing for Spatial Consistency
To predict an environment map for a specific viewpoint (for example, where a virtual object is to be inserted), the framework performs volume ray tracing through the SGLV. Rays are cast from the target location, and the lighting contribution along each ray is accumulated by sampling and blending the SG parameters of the voxels it intersects. This physically based process ensures that lighting predictions are geometrically consistent across different locations in the scene.
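As a rough illustration of tracing one ray through such a volume, the sketch below marches through a unit-cube voxel grid and composites SG radiance front to back. It is a simplification under stated assumptions, not the paper's procedure: each voxel holds a single SG lobe, lookups are nearest-neighbour, and the per-voxel opacity `alpha` and the name `trace_ray` are hypothetical.

```python
import numpy as np

def trace_ray(origin, direction, amp, axis, sharp, alpha, n_steps=64, t_max=1.0):
    """Front-to-back compositing of SG radiance along one ray.

    amp:   (X, Y, Z, 3) SG amplitudes (RGB)
    axis:  (X, Y, Z, 3) unit lobe axes
    sharp: (X, Y, Z)    lobe sharpness
    alpha: (X, Y, Z)    per-voxel opacity in [0, 1] (assumed, not from the source)
    The grid spans the unit cube [0, 1)^3.
    """
    res = np.array(alpha.shape)
    direction = direction / np.linalg.norm(direction)
    radiance = np.zeros(3)
    transmittance = 1.0
    for t in np.linspace(0.0, t_max, n_steps):
        p = origin + t * direction
        if np.any(p < 0.0) or np.any(p >= 1.0):
            continue  # this sample lies outside the volume
        i, j, k = (p * res).astype(int)  # nearest-neighbour voxel lookup
        # Radiance travelling from the voxel back to the query point (-direction).
        cos_angle = axis[i, j, k] @ (-direction)
        emitted = amp[i, j, k] * np.exp(sharp[i, j, k] * (cos_angle - 1.0))
        radiance += transmittance * alpha[i, j, k] * emitted
        transmittance *= 1.0 - alpha[i, j, k]
        if transmittance < 1e-4:
            break  # everything behind this point is occluded
    return radiance
```

Calling `trace_ray` once per direction of a spherical grid from the same `origin` would yield a coarse environment map for that location; because every location queries the same volume, nearby queries agree with each other by construction.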
2.4. Hybrid Blending Network for Environment Maps
The raw SG parameters from ray tracing are fed into a hybrid blending network. This network refines the coarse lighting estimate into a detailed, high-resolution HDR environment map, recovering fine details such as reflections of visible surfaces.
2.5. In-Network Monte-Carlo Rendering Layer
A key innovation is the in-network Monte-Carlo rendering layer. This layer takes the predicted HDR environment map and a 3D model of a virtual object, renders the object with path tracing, and compares the result with a ground-truth rendering. The gradient of this photorealistic loss is backpropagated through the lighting prediction pipeline, directly optimizing for the end goal of photorealistic object insertion.
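To make the Monte-Carlo idea concrete, the sketch below estimates the outgoing radiance of a diffuse surface point lit by an environment function. It is far simpler than the paper's layer: it covers a single bounce on a Lambertian surface with normal +z, ignores occlusion, and is not differentiable. The names `sample_cosine_hemisphere`, `diffuse_radiance`, and `env_fn` are hypothetical.

```python
import numpy as np

def sample_cosine_hemisphere(n, rng):
    """Draw n unit directions around +z with pdf cos(theta) / pi."""
    u1, u2 = rng.random(n), rng.random(n)
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2
    return np.stack([r * np.cos(phi), r * np.sin(phi), np.sqrt(1.0 - u1)], axis=1)

def diffuse_radiance(env_fn, albedo, n_samples=4096, seed=0):
    """Monte-Carlo estimate of (albedo / pi) * integral of L(w) * cos(theta) dw.

    env_fn maps directions of shape (N, 3) to RGB radiance of shape (N, 3).
    """
    rng = np.random.default_rng(seed)
    dirs = sample_cosine_hemisphere(n_samples, rng)
    # With pdf = cos(theta) / pi, the cosine and pi terms cancel,
    # leaving albedo times the sample mean of the incoming radiance.
    return albedo * env_fn(dirs).mean(axis=0)
```

Under a constant white environment the estimate equals the albedo, which is a convenient sanity check. A rendering loss would compare such an estimate, computed with predicted lighting, against the same quantity computed with ground-truth lighting.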
2.6. Recurrent Neural Networks for Temporal Consistency
For video sequence input, the framework incorporates Recurrent Neural Networks (RNNs). The RNNs aggregate information from past frames, allowing the system to refine the SGLV as more of the scene is observed. More importantly, they enforce smooth transitions between predictions in consecutive frames, eliminating flicker and ensuring temporal coherence.
3. Dataset Enhancement: OpenRooms
Training such a data-hungry model requires a large dataset of indoor scenes with ground-truth HDR lighting. The authors substantially extended the public OpenRooms dataset. The enhanced version includes approximately 360,000 HDR environment maps at much higher resolution and 38,000 video sequences, all rendered with GPU-accelerated path tracing for physical accuracy. This dataset is a significant contribution to the community.
Dataset Statistics
- 360K HDR environment maps
- 38K video sequences
- Path-traced ground truth
4. Experiments and Results
4.1. Experimental Setup
The framework was evaluated against state-of-the-art single-image lighting estimation methods (e.g., [Gardner et al. 2017], [Song et al. 2022]) and video-based methods. Metrics included image-based measures (PSNR, SSIM) computed on rendered objects, a perceptual metric (LPIPS), and user studies to assess photorealism.
4.2. Quantitative Results
The proposed method outperformed all baselines in the quantitative comparisons. It achieved the highest PSNR and SSIM scores for virtual object renderings, indicating more accurate lighting prediction. Its perceptual score (LPIPS) was also the best; because LPIPS measures perceptual distance, better means lower, which suggests the results look more realistic to viewers.
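For reference, PSNR between a rendered object and its ground-truth rendering can be computed as below. This is the standard definition, given as a sketch; how the paper tone-maps or normalizes HDR renderings before scoring is not stated in this summary, so the `max_val=1.0` default is an assumption.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the target."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

PSNR and SSIM reward higher values, whereas LPIPS rewards lower values, so the direction of "better" has to be read per metric when comparing methods.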
4.3. Qualitative Results and Visual Comparison
The qualitative results, shown in Figure 1 of the PDF, demonstrate several important advantages:
- Recovery of Unseen Light Sources: The method successfully infers the presence and properties of light sources outside the camera's field of view.
- Detailed Surface Reflections: The predicted environment maps contain sharp, accurate reflections of visible room surfaces (walls, furniture), which are essential for rendering mirror-like and glossy objects.
- Spatial Consistency: Virtual objects inserted at different locations in the same scene show lighting that agrees with the local geometry and the global illumination.
- Temporal Smoothness: In video sequences, the lighting on inserted objects evolves gradually as the camera moves, without the popping or flickering artifacts common in frame-by-frame methods.
4.4. Ablation Studies
Ablation studies confirmed the importance of each component:
- Removing the SGLV and volume ray tracing produced spatially inconsistent predictions.
- Omitting the in-network Monte-Carlo rendering layer led to less realistic object insertions, despite good environment-map metrics.
- Disabling the RNNs for video processing caused visible temporal flicker.
5. Technical Details and Mathematical Formulation
The loss function is a multi-term objective: $\mathcal{L} = \mathcal{L}_{env} + \alpha \mathcal{L}_{render} + \beta \mathcal{L}_{temp}$, where $\alpha$ and $\beta$ weight the rendering and temporal terms.
- $\mathcal{L}_{env}$: An L2 loss between the predicted and ground-truth HDR environment maps.
- $\mathcal{L}_{render}$: The photorealistic rendering loss from the in-network Monte-Carlo layer. It is computed as the difference between a virtual object rendered with the predicted lighting and the path-traced ground-truth rendering.
- $\mathcal{L}_{temp}$: A temporal smoothness loss applied to the SGLV parameters across consecutive frames of a video sequence, enforced by the RNNs.
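The three terms can be combined as in the sketch below. The summary does not give the exact form of the rendering and temporal terms or the weight values, so the mean-squared differences and the defaults for `alpha` and `beta` are assumptions chosen for illustration.

```python
import numpy as np

def total_loss(env_pred, env_gt, render_pred, render_gt, sg_frames,
               alpha=1.0, beta=0.1):
    """L = L_env + alpha * L_render + beta * L_temp (illustrative forms).

    sg_frames: array of shape (T, ...) holding SGLV parameters per video frame.
    """
    # L_env: L2 loss between predicted and ground-truth HDR environment maps.
    l_env = np.mean((env_pred - env_gt) ** 2)
    # L_render: assumed here to be a mean-squared difference between renderings.
    l_render = np.mean((render_pred - render_gt) ** 2)
    # L_temp: assumed here to penalize changes between consecutive frames.
    if len(sg_frames) > 1:
        l_temp = np.mean((sg_frames[1:] - sg_frames[:-1]) ** 2)
    else:
        l_temp = 0.0
    return l_env + alpha * l_render + beta * l_temp
```

For a single image, `sg_frames` has length one and the temporal term vanishes, leaving only the environment-map and rendering terms.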
6. Analysis Framework: Core Insight & Logical Flow
Core Insight: The paper's fundamental advance is not merely a better neural network for environment maps; it is the recognition that lighting is a property of a 3D field, not a view-dependent 2D texture. By shifting the output from a 2D panorama to a 3D Spherical Gaussian Lighting Volume (SGLV), the authors address the spatial consistency problem at its root. This is a conceptual leap comparable to the shift from image-based rendering to neural radiance fields (NeRF) [Mildenhall et al. 2020]: it moves the representation into the scene's native 3D space. The in-network Monte-Carlo renderer is the second major contribution, creating a direct, gradient-based link between the lighting estimate and the ultimate measure of success: photorealism in the AR composite.
Logical Flow: The architecture's logic is coherent, with each stage building on the last. 1) 3D Contextualization: The inputs (LDR + depth) are fused into a 3D feature volume. 2) Volumetric Lighting Reconstruction: The decoder outputs the SGLV, a spatially aware lighting model. 3) Differentiable Physics: Volume ray tracing queries this model for any viewpoint, ensuring spatial consistency by construction. 4) Appearance Refinement & Direct Optimization: A 2D network adds high-frequency detail, and the Monte-Carlo layer optimizes directly for final rendering quality. 5) Temporal Integration: For video, the RNNs act as a memory bank, refining the SGLV over time and filtering the output for smoothness. Each step addresses a specific weakness of prior work.
7. Strengths, Flaws, and Actionable Insights
Strengths:
- Foundational Representation: The SGLV is an elegant, powerful representation that could influence future work beyond lighting estimation.
- End-to-End Optimization for the Task: The in-network renderer is a clear example of task-specific loss design, moving beyond proxy losses (such as L2 on environment maps) to optimize for the actual objective.
- Comprehensive Solution: It handles both single-image and video problems in a unified framework, addressing spatial AND temporal consistency, a rare combination.
- Resource Contribution: The enhanced OpenRooms dataset is a major asset for the research community.
Flaws & Critical Questions:
- Depth Dependence: The method requires a depth map. Although depth sensors are becoming common, its performance on RGB-only input is unknown. This limits its use on legacy media or devices without depth sensing.
- Computational Cost: Training involves path tracing, and inference requires volume ray tracing. This is not yet a lightweight mobile solution. The paper does not discuss inference speed or model compression.
- Generalization to "In-the-Wild" Data: The model is trained on synthetic, path-traced data (OpenRooms). Its performance on real, noisy, low-quality mobile photos, which often violate the physical assumptions of path tracing, remains the billion-dollar question for AR deployment.
- Material Ambiguity: As in all inverse rendering tasks, lighting estimation is entangled with surface material estimation. The framework assumes known or estimated geometry but does not explicitly solve for materials, which limits accuracy in complex, non-Lambertian scenes.
Actionable Insights:
- For Researchers: The SGLV + volume tracing paradigm is the key takeaway. Explore its application to related tasks such as view synthesis or material estimation. Investigate self-supervised learning or test-time adaptation to bridge the sim-to-real gap for real mobile data.
- For Engineers/Product Teams: Treat this as the gold standard for high-fidelity AR. For near-term product integration, focus on distilling this model (for example, via knowledge distillation [Hinton et al. 2015]) into a mobile-friendly version that can run in real time, perhaps by approximating the SGLV with a more efficient data structure.
- For Data Strategists: The value of high-quality synthetic data is confirmed. Invest in generating more diverse, physically accurate synthetic datasets that cover a wider range of lighting phenomena (for example, complex caustics and participating media).
8. Application Outlook and Future Directions
Immediate Applications:
- High-End AR Content Creation: Professional tools for film, architecture, and interior design, where photorealistic virtual object insertion is critical.
- Immersive Telepresence & Conferencing: Relighting a user's face consistently with the remote environment for realistic video calls.
- E-Commerce & Retail: Letting customers visualize products (furniture, decor, appliances) in their own homes under accurate lighting conditions.
Future Research Directions:
- Unified Inverse Rendering: Extending the framework to jointly estimate lighting, materials, and geometry from sparse input, moving toward a complete scene understanding pipeline.
- Efficiency and On-Device Deployment: Research into model compression, efficient neural rendering techniques, and hardware-aware architectures to bring this level of quality to real-time mobile AR.
- Handling Dynamic Lighting: The current work focuses on static scenes. A major frontier is estimating and predicting dynamic lighting changes (for example, switching lights on or off, moving light sources, changing sunlight).
- Integration with Neural Scene Representations: Combining the SGLV idea with neural scene representations such as NeRF or 3D Gaussian Splatting [Kerbl et al. 2023] to build a fully differentiable, editable neural scene model.
9. References
- Zhengqin Li, Li Yu, Mikhail Okunev, Manmohan Chandraker, Zhao Dong. "Spatiotemporally Consistent HDR Indoor Lighting Estimation." ACM Trans. Graph. (Proc. SIGGRAPH), 2023.
- Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng. "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis." ECCV, 2020.
- Geoffrey Hinton, Oriol Vinyals, Jeff Dean. "Distilling the Knowledge in a Neural Network." arXiv:1503.02531, 2015.
- Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis. "3D Gaussian Splatting for Real-Time Radiance Field Rendering." ACM Trans. Graph., 2023.
- Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros. "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks." ICCV, 2017. (CycleGAN, cited for domain adaptation ideas relevant to sim-to-real.)
- OpenRooms Dataset. https://openrooms.github.io/