Probabilistic graphical modelling of seismic data processing in mining

Du Toit, Cornel (2020-04)

Thesis (PhD)--Stellenbosch University, 2020.

Thesis

ENGLISH ABSTRACT: Mining has long been characterised by deep shafts and dangerous conditions. Accurate monitoring and prediction of seismic activity and rockfalls are matters of life and death. The Institute of Mine Seismology (IMS) is the world’s largest independent organisation that provides worldwide mine seismic data processing using human data processors. Approximately 35000 seismic events are processed per day by a team of 65 data processors (24 hours a day, 365 days a year) in order to provide rapid data assessments to the mine, typically within minutes of the event being recorded by the seismic network. This aim is achievable only with the assistance of automatic, computer-based data processing. While automatic processing is common in natural earthquake seismology, in mining-induced seismology the problem is more complex, and an automatic processor is yet to be developed. In mine seismology, classification of the recorded data is essential as there are many sources of noise in mines. Furthermore, with dense seismic sensor arrays in seismically active mines, multiple signals associated with both seismic events and noise sources may be conflated into a single seismogram. Matching the pressure (P) and shear (S) waves of a specific seismic event across multiple sensors is not a simple task, even for an experienced seismologist. In this dissertation, an automatic method based on probabilistic graphical models is investigated for both the event classification (seismic event, blast or rejected event) and the determination of the phase arrival times (P- and S-wave). This machine learning approach has led to higher reliability, faster availability of results, more satisfied clients, less organisational load, as well as a financial advantage to IMS and its clients. Using Hidden Markov Models (HMMs) as a classification tool, different characteristics of the wave can be analysed.
By identifying the most likely hidden states (P-wave and S-wave) using the Viterbi algorithm combined with standard short-term average (STA) and long-term average (LTA) analysis, the candidate phase arrivals for each sensor are determined. The probability of each candidate phase arrival being the true arrival is treated as a parameter, expressed as a mixing weight, through the introduction of latent variables. The latent variables, together with the seismic event location parameters (3D multi-sensor and origin time), are written as a probabilistic graphical model (PGM), which turns out to be a hierarchical Bayesian network. In most cases, the maximum a posteriori (MAP) estimates of the latent variables are the true phase arrivals. In cases where the optimisation technique fails to deliver the MAP estimates, e.g. when it gets stuck in a local maximum, outlier detection techniques are used to identify spurious events. Of a total of 80 mines, the automatic processor which forms the subject of this dissertation is currently being tested on the 25 most seismically active ones. Of an average of 35000 daily events (based on all 80 mines), 60% can be successfully processed. The average quality control score of the automatic processor is slightly higher than the average human quality score, at a fraction of the cost.
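
The STA/LTA analysis mentioned above can be illustrated with a minimal sketch. This is not the implementation used in the dissertation: the function names, the use of squared amplitude as the characteristic function, the window lengths and the threshold are all assumptions chosen for illustration.

```python
from itertools import accumulate


def sta_lta_ratio(trace, n_sta, n_lta):
    """Return the STA/LTA ratio of the squared amplitudes of `trace`.

    Hypothetical helper: both windows end at the current sample, and
    samples before the first full long-term window get a ratio of 0.0.
    """
    if not 0 < n_sta < n_lta <= len(trace):
        raise ValueError("need 0 < n_sta < n_lta <= len(trace)")
    energy = [x * x for x in trace]
    # prefix[i] holds sum(energy[:i]), so a window sum is a difference.
    prefix = [0.0] + list(accumulate(energy))
    ratio = [0.0] * len(trace)
    for i in range(n_lta - 1, len(trace)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio


def candidate_onsets(ratio, threshold):
    """Return the sample indices where the ratio rises above `threshold`."""
    return [
        i
        for i in range(1, len(ratio))
        if ratio[i] >= threshold and ratio[i - 1] < threshold
    ]


# Synthetic trace: 50 samples of low-level noise, then a stronger arrival.
trace = [0.1] * 50 + [1.0] * 20
ratio = sta_lta_ratio(trace, n_sta=5, n_lta=30)
onsets = candidate_onsets(ratio, threshold=3.0)
```

In this sketch each threshold crossing only yields a candidate arrival. Deciding which candidate is the true P- or S-wave arrival is the task the abstract assigns to the HMM and the Bayesian network.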

AFRIKAANSE OPSOMMING: Mynbou word lank reeds gekenmerk deur diep skagte en gevaarlike toestande. Akkurate monitering en voorspelling van seismiese aktiwiteit en rotsbarste kan gesien word as sake van lewe en dood. Ter verduideliking: Hier word die term “seismic event” eenvoudig vertaal as “skudding”. Die Instituut van Myn Seismologie (IMS) is die wêreld se grootste onafhanklike organisasie wat wêreldwye myn-verwante seismiese dataverwerking aanbied as een van sy dienste deur gebruik te maak van dataprosesseerders. Ongeveer 35000 skuddings word daagliks verwerk deur ’n span van 65 dataprosesseerders (24 uur per dag, 365 dae per jaar) met die doel om die verwerkte data in die myn se databasis op te dateer, minute nadat die skudding plaasgevind het. Hierdie doel kan net bereik word met die hulp van ’n outomatiese, rekenaargebaseerde dataverwerker. Terwyl outomatiese verwerkers in natuurlike aardbewing seismologie reeds bestaan, is die probleem in mynbou-geïnduseerde seismisiteit meer kompleks en daar is nog nie ’n outomatiese verwerker ontwikkel nie. Klassifikasie van data is noodsaaklik in myn seismologie aangesien daar baie bronne van geraas is. Verder, in die geval van aktiewe myne met digte seismiese netwerke, kan daar dikwels meervoudige seine, geassosieer met beide skuddings en geraasbronne, in ’n enkele seismogram aangeteken word. Dit is dan selfs vir ervare seismoloë moeilik om te bepaal watter druk (P)- en skuif (S)- golwe by mekaar pas wanneer dit gegenereer is deur ’n spesifieke skudding en aangeteken word by meervoudige seismiese sensore. ’n Outomatiese metode, gebaseer op probabilistiese grafiese modelle vir beide die klassifikasie (skudding, ontploffing of verwerpte rekord) en die bepaling van die fase-aankomstye (P- en S-golf) word ondersoek in hierdie proefskrif.
Hierdie masjienleerbenadering het gelei tot hoër betroubaarheid, vinniger beskikbaarheid van resultate, meer tevrede kliënte, minder organisatoriese betrokkenheid, sowel as finansiële voordele vir IMS en sy kliënte. “Hidden Markov Models” (HMM) word gebruik as klassifikasie instrument en verskillende kenmerke van die golf word ontleed. Die kandidaat fase-aankomstye van die P- en S- golwe vir sensore word bepaal deur: a) identifisering van die mees moontlike verborge toestande deur gebruik van die Viterbi-algoritme en b) kombinasie met die standaard korttermyn gemiddelde (STA) en langtermyn (LTA) analise. Die waarskynlikheid dat elke kandidaat fase aankoms die ware aankoms is, word as parameters gesien, uitgedruk as ’n menggewig deur die gebruik van latente veranderlikes. Die latente veranderlikes tesame met die liggingparameters (3D-middelpunt en oorsprongstyd), word geskryf as ’n probabilistiese grafiese model (PGM) en blyk dan om ’n hiërargiese Bayesiese netwerk te wees. In die meeste gevalle is die maksimum a posteriori (MAP) skattings van die latente veranderlikes die ware fase aankomste. In gevalle waar die optimaliseringstegniek misluk het om die MAP-ramings te lewer, d.w.s. waar dit vasval in plaaslike maksima, word uitskieter-opsporingstegnieke gebruik om verdagte skuddings te identifiseer. Die outomatiese verwerker waaroor hierdie proefskrif handel word tans getoets op 25 van die mees seismies-aktiewe myne. 60% van ’n gemiddelde 35000 daagliks aangetekende rekords (gebaseer op al 80 myne), word suksesvol verwerk deur die outomatiese prosesseerder. Die gemiddelde kwaliteitskontrole telling van die outomatiese verwerker is effens bokant die gemiddelde kwaliteit-telling van die menslike prosesseerders teen fraksie van koste.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/107769