Deep image and video compression

dc.contributor.advisor: Kamper, M. J. (en_ZA)
dc.contributor.advisor: Engelbrecht, H. A. (en_ZA)
dc.contributor.author: Nortje, Andre (en_ZA)
dc.contributor.other: Stellenbosch University. Faculty of Engineering. Dept. of Electrical and Electronic Engineering. (en_ZA)
dc.date.accessioned: 2020-02-24T12:26:26Z
dc.date.accessioned: 2020-04-28T12:22:29Z
dc.date.available: 2020-02-24T12:26:26Z
dc.date.available: 2020-04-28T12:22:29Z
dc.date.issued: 2020-04
dc.description: Thesis (MEng)--Stellenbosch University, 2020. (en_ZA)
dc.description.abstract: ENGLISH ABSTRACT: Forecasts indicate that video will make up 82% of all Internet traffic by 2022. Advancing video compression efficiency will play a crucial role in curbing high bitrates and mitigating excessive bandwidth consumption. To this end, recent deep learning models are emerging as likely successors to hand-tuned standard video codecs. Our goal is to further refine the compression quality of existing video codecs by improving their ability to predict video content. We subdivide video compression into two focus areas: 1. Still image compression of video frames, for which we propose the Binary Inpainting Network (BINet). 2. Motion compression in video, for which we learn binary motion codes (P-FrameNet and B-FrameNet). With BINet we learn to inpaint an image patch from the binary codes of its nearest neighbours to better compress a still image or single video frame (intra-frame compression). We adapt BINet to perform inter-frame prediction with P-FrameNet and B-FrameNet by learning binary motion codes that compensate for the relative displacement undergone by objects in a video sequence across time. Within the context of video compression our prediction methods are, to the best of our knowledge, the first fully parallelisable means of video intra-frame and inter-frame prediction. We show how inclusion of the BINet framework improves the intra-frame compression of a competitive deep image codec across a range of bitrates such that it outperforms the standard image codec JPEG. Experiments also highlight that its full-context patch inpaintings are of a higher quality than those sequentially predicted by the standard image codec WebP. In terms of inter-frame video prediction, we show that our learned binary motion codes describe more complex motion than the block-based optical flow algorithms employed by the standard video codecs H.264 and H.265.
This indicates that BINet and our learned binary motion codes could be valuable extensions to existing video codecs, specifically in improving their intra-frame and inter-frame compression capabilities. (en_ZA)
dc.description.abstract: AFRIKAANS ABSTRACT (in English translation): Forecasts indicate that video will make up 82% of all Internet traffic by 2022. Advancing video compression efficiency will play an important role in curbing high bitrates and reducing excessive bandwidth consumption. With this in mind, recent deep learning models are emerging as likely successors to the standard hand-tuned video codecs. Our goal is to further refine the compression quality of existing video codecs by improving their ability to predict video content. We divide video compression into two focus areas: 1. Still image compression of video frames, for which we propose the Binary Inpainting Network (BINet). 2. Motion compression in video, for which we learn binary motion codes (P-FrameNet and B-FrameNet). Using BINet, we learn to inpaint an image patch from the binary codes of its nearest neighbours in order to better compress a single video frame (intra-frame compression). We adapt BINet to perform inter-frame prediction with P-FrameNet and B-FrameNet by learning binary motion codes that compensate for the relative displacement that objects in a video sequence undergo over time. Within the context of video compression, BINet is, to the best of our knowledge, the first fully parallel means of predicting video frames. We show how including the BINet framework improves the intra-frame compression of a competitive deep image codec across a range of bitrates, such that it outperforms the standard image codec JPEG. Experiments also highlight that its full-context patch inpaintings are of a higher quality than those predicted sequentially by the standard image codec WebP. In terms of inter-frame prediction, we show that our learned binary motion codes describe more complex motion than the block-based optical flow algorithms used by the standard video codecs H.264 and H.265.
This indicates that BINet and our learned binary motion codes could be valuable extensions to existing video codecs, particularly for improving their intra-frame and inter-frame compression capabilities. (af_ZA)
dc.description.version: Masters (en_ZA)
dc.format.extent: xvi, 89 leaves : illustrations (some color)
dc.identifier.uri: http://hdl.handle.net/10019.1/108155
dc.language.iso: en (en_ZA)
dc.publisher: Stellenbosch : Stellenbosch University (en_ZA)
dc.rights.holder: Stellenbosch University (en_ZA)
dc.subject: Image processing -- Digital techniques (en_ZA)
dc.subject: Video compression (en_ZA)
dc.subject: Digital video -- Editing -- Data processing (en_ZA)
dc.subject: UCTD (en_ZA)
dc.subject: Machine learning (en_ZA)
dc.subject: Binary motion codes (en_ZA)
dc.title: Deep image and video compression (en_ZA)
dc.type: Thesis (en_ZA)
Files
Original bundle
Name: nortje_deep_2020.pdf
Size: 11.16 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text