Deep image and video compression

Nortje, Andre

dc.contributor.advisor	Kamper, M. J.	en_ZA
dc.contributor.advisor	Engelbrecht, H. A.	en_ZA
dc.contributor.author	Nortje, Andre	en_ZA
dc.contributor.other	Stellenbosch University. Faculty of Engineering. Dept. of Electrical and Electronic Engineering.	en_ZA
dc.date.accessioned	2020-02-24T12:26:26Z
dc.date.accessioned	2020-04-28T12:22:29Z
dc.date.available	2020-02-24T12:26:26Z
dc.date.available	2020-04-28T12:22:29Z
dc.date.issued	2020-04
dc.description	Thesis (MEng)--Stellenbosch University, 2020.	en_ZA
dc.description.abstract	ENGLISH ABSTRACT: Forecasts indicate that video will make up 82% of all Internet traffic by 2022. Advancing video compression efficiency will play a crucial role in curbing high bitrates and mitigating excessive bandwidth consumption. To this end, recent deep learning models are emerging as likely successors to hand-tuned standard video codecs. Our goal is to further refine the compression quality of existing video codecs by improving their ability to predict video content. We subdivide video compression into two focus areas: 1. Still image compression of video frames, for which we propose the Binary Inpainting Network (BINet). 2. Motion compression in video, for which we learn binary motion codes (P-FrameNet and B-FrameNet). With BINet we learn to inpaint an image patch from the binary codes of its nearest neighbours to better compress a still image or single video frame (intra-frame compression). We adapt BINet to perform inter-frame prediction with P-FrameNet and B-FrameNet by learning binary motion codes that compensate for the relative displacement undergone by objects in a video sequence across time. Within the context of video compression our prediction methods are, to the best of our knowledge, the first fully parallelisable means of video intra-frame and inter-frame prediction. We show how inclusion of the BINet framework improves the intra-frame compression of a competitive deep image codec across a range of bitrates such that it outperforms the standard image codec JPEG. Experiments also highlight that its full-context patch inpaintings are of a higher quality than those sequentially predicted by the standard image codec WebP. In terms of inter-frame video prediction, we show that our learned binary motion codes describe more complex motion than the block-based optical flow algorithms employed by the standard video codecs H.264 and H.265.
This indicates that the BINet and our learned binary motion codes could be valuable extensions to existing video codecs, specifically in improving their intra-frame and inter-frame compression capabilities.	en_ZA
dc.description.abstract	AFRIKAANS SUMMARY: Forecasts indicate that video will make up 82% of all Internet traffic by 2022. Advancing video compression efficiency will play an important role in curbing high bitrates and reducing excessive bandwidth consumption. With this in mind, recent deep learning models are emerging as likely successors to the standard hand-tuned video codecs. Our goal is to further refine the compression quality of existing video codecs by improving their ability to predict video content. We divide video compression into two focus areas: 1. Still image compression of video frames, for which we propose the Binary Inpainting Network (BINet). 2. Motion compression in video, for which we learn binary motion codes (P-FrameNet and B-FrameNet). Using BINet, we learn to inpaint an image patch from the binary codes of its nearest neighbours in order to better compress a single video frame (intra-frame compression). We adapt BINet to perform inter-frame prediction with P-FrameNet and B-FrameNet by learning binary motion codes that compensate for the relative displacement undergone by objects in a video sequence over time. Within the context of video compression, BINet is, to the best of our knowledge, the first fully parallel means of video frame prediction. We show how including the BINet framework improves the intra-frame compression of a competitive deep image codec across a range of bitrates such that it outperforms the standard image codec JPEG. Experiments also highlight that the full-context patch inpaintings are of a higher quality than those predicted sequentially by the standard image codec WebP. In terms of inter-frame prediction, we show that our learned binary motion codes describe more complex motion than the block-based optical flow algorithms used by the standard video codecs H.264 and H.265.
This indicates that BINet and our learned binary motion codes could be valuable extensions to existing video codecs, especially for improving their intra-frame and inter-frame compression capabilities.	af_ZA
dc.description.version	Masters	en_ZA
dc.format.extent	xvi, 89 leaves : illustrations (some color)
dc.identifier.uri	http://hdl.handle.net/10019.1/108155
dc.language.iso	en	en_ZA
dc.publisher	Stellenbosch : Stellenbosch University	en_ZA
dc.rights.holder	Stellenbosch University	en_ZA
dc.subject	Image processing -- Digital techniques	en_ZA
dc.subject	Video compression	en_ZA
dc.subject	Digital video -- Editing -- Data processing	en_ZA
dc.subject	UCTD	en_ZA
dc.subject	Machine learning	en_ZA
dc.subject	Binary motion codes	en_ZA
dc.title	Deep image and video compression	en_ZA
dc.type	Thesis	en_ZA

Files

Original bundle

Name: nortje_deep_2020.pdf
Size: 11.16 MB
Format: Adobe Portable Document Format

Collections

Masters Degrees (Electrical and Electronic Engineering)