The development of techniques to select a control policy during proactive on-line planning and control

Morris, J. W. (2001-12)

Thesis (MScEng)--Stellenbosch University, 2001.

Thesis

ENGLISH ABSTRACT: The worldwide trend is for systems to become more complex. This leads to the need for new ways to control these complex systems. A relatively new approach to controlling systems, called on-line planning and control, offers many potential benefits to a variety of end-users, especially in the manufacturing environment. Davis [3] developed a framework for on-line planning and control that is currently incomplete. This project aims to fill one of the gaps in the framework by automating one of its functions, eliminating the need for a human observer. This function, the real-time compromise analysis function, compares the statistical performance estimates to select a control policy for implementation in the system being controlled (the real-world system) at the current moment in time. In this project, two techniques were developed to automate the function. The first technique is based on a common technique for statistically comparing two systems, the paired-t confidence interval technique. The paired-t confidence interval technique is used to compare the control policies by building confidence intervals of the expected differences for the respective performance criteria and testing the hypothesis that the statistical performance estimates of one control policy are better than those of the other. The results of these comparisons are then consolidated into a compromise function that is used to determine the control policy to be implemented currently in the real-world system. The second technique is derived from, but differs greatly from, Davis's [3] dominance probability density function approach, and it includes principles of the paired-t confidence interval technique. It compares the control policies by determining the probability (confidence level) with which one can assume that, for a given performance criterion, one control policy will provide a better performance value than the other, and vice versa.
These confidence levels are then aggregated into a single compromise function that is used to determine the control policy to be implemented currently in the real-world system. After the techniques were developed, it was not possible to determine their efficiency mathematically, because their statistical base is suspect. The techniques needed to be implemented before they could be evaluated, and it was decided to develop an emulator of the on-line planning and control process, in accordance with the framework given by Davis [3], in which to implement them. This Emulator is in essence a Visual Basic program that uses Arena models. However, the Emulator needed certain deviations from the framework to make it possible. Firstly, while the systems that will be controlled with the on-line planning and control process will be complex systems, the system controlled in the Emulator is only a straightforward M/M/1/FIFO/∞/∞ system. This allowed issues that have not yet been addressed sufficiently, e.g. the initialising of the system models, to be bypassed. Secondly, the Emulator does not include all parts of the framework; parts for which the technology does not currently exist have been excluded. Thirdly, the real-world system is replaced with a model, because a real-world system was not available for the study. Finally, concurrent operations are actually done sequentially, but in a way that makes it seem that they were done concurrently, so as not to influence the results. This Emulator was used to analyse both techniques for two different traffic intensities. The first part of the analysis consisted of an off-line non-terminating analysis of the individual control policies of the system. This was used as a baseline against which the on-line planning and control process of the Emulator was evaluated.
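As a rough illustration of the kind of single-server queue the Emulator controls, the sketch below estimates the mean waiting time in an M/M/1 FIFO queue in two ways: with the standard analytic formula and with a short simulation based on the Lindley recursion. It is not the thesis's Visual Basic and Arena implementation; the function names, the arrival and service rates, and the run length are all assumptions chosen for illustration, and the abstract does not state which traffic intensities were studied.

```python
import random


def mm1_mean_wait_analytic(arrival_rate, service_rate):
    """Steady-state mean time in queue (excluding service) for M/M/1 FIFO."""
    if arrival_rate >= service_rate:
        raise ValueError("traffic intensity must be below 1 for a steady state")
    rho = arrival_rate / service_rate  # traffic intensity
    return rho / (service_rate - arrival_rate)


def mm1_mean_wait_simulated(arrival_rate, service_rate, customers, seed=0):
    """Estimate the mean time in queue with the Lindley recursion.

    The queue starts empty and no warm-up period is discarded, so short
    runs are biased towards low waiting times.
    """
    rng = random.Random(seed)
    wait = 0.0   # waiting time of the current customer
    total = 0.0
    for _ in range(customers):
        total += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # The next customer waits for whatever work is left when it arrives.
        wait = max(0.0, wait + service - interarrival)
    return total / customers


if __name__ == "__main__":
    # Hypothetical rates giving a traffic intensity of 0.5.
    print(mm1_mean_wait_analytic(0.5, 1.0))
    print(mm1_mean_wait_simulated(0.5, 1.0, customers=200_000))
```

For a long enough run the simulated estimate should approach the analytic value, which is the sense in which an off-line steady-state analysis can serve as a baseline.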
The findings of the evaluations were that, at the traffic intensities evaluated, the techniques provided results that were very similar to the results of the best individual control policy. From these results, it was speculated that at different traffic intensities, different individual control policies would outperform the techniques, while the techniques would give only slightly worse results. In addition, because the on-line planning and control process attempts to respond to changing conditions, it can be assumed that the techniques will excel in conditions where the input distribution is changing continuously. It is also speculated that the techniques may be advantageous in cases where it is not possible to determine beforehand which of the individual control policies to use, because it is impossible to predict the input distribution that will occur. It is expected that the techniques will give good (but unfortunately, not necessarily the best) results for any input distribution, while an individual control policy that gives the best results for one input distribution may prove disastrous for another. Three important conclusions can be drawn from the project. Firstly, it is possible to automate the real-time compromise analysis function. Secondly, an emulator can be developed to evaluate the techniques for the real-time compromise analysis. The greatest advantage of this Emulator is that it can run significantly faster than real time, enabling the generation of enough data to make the statistically significant comparisons needed to evaluate the techniques. The final conclusion is that, while initial evaluations are inconclusive, it can be shown that the techniques warrant further study. Three important recommendations can be made from the project. Firstly, the techniques need to be studied further, because it cannot be claimed that they are perfect or that they are the only possible techniques that will work.
In fact, they are merely techniques that may work, and other techniques may still prove to be better. Secondly, because it would be foolhardy to assume that the Emulator is complete, the Emulator needs to be improved, the most critical need being to develop it in a programming language and simulation package that allow concurrent operations and effortless initialisation. This will enable the Emulator to be much faster and a lot more flexible. The final recommendation is that the techniques need to be evaluated with other parameters in other, increasingly complex systems, culminating in the evaluation of the on-line planning and control process, with the techniques included, in a real-world flexible manufacturing system. Only then can it be decided conclusively whether the techniques are efficient or not. It is hoped that this project will form a valuable building block that helps make on-line planning and control a viable alternative for controlling complex systems, enabling them to respond better to the changing conditions that are becoming the norm.
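The paired-t confidence interval comparison mentioned in the abstract can be sketched for a single performance criterion as follows. This is a minimal illustration under stated assumptions, not the thesis's technique: the function names and sample values are hypothetical, a smaller value is assumed to be better, the critical t value is passed in by the caller rather than computed, and the compromise function that consolidates several criteria is not reproduced because the abstract does not define it.

```python
import math
import statistics


def paired_t_interval(policy_a, policy_b, t_crit):
    """Confidence interval for the mean of the paired differences a - b.

    t_crit is the two-sided critical value of Student's t for n - 1
    degrees of freedom at the chosen confidence level.
    """
    if len(policy_a) != len(policy_b) or len(policy_a) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [a - b for a, b in zip(policy_a, policy_b)]
    mean_diff = statistics.mean(diffs)
    half_width = t_crit * statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff - half_width, mean_diff + half_width


def compare_policies(policy_a, policy_b, t_crit):
    """Pick the policy with the significantly smaller criterion value."""
    low, high = paired_t_interval(policy_a, policy_b, t_crit)
    if high < 0:
        return "A"
    if low > 0:
        return "B"
    return "no significant difference"


if __name__ == "__main__":
    # Hypothetical performance estimates from five paired replications.
    waits_a = [4.1, 3.9, 4.3, 4.0, 4.2]
    waits_b = [5.0, 4.8, 5.1, 4.7, 5.2]
    # 2.776 is the 97.5th percentile of Student's t with 4 degrees of freedom.
    print(paired_t_interval(waits_a, waits_b, t_crit=2.776))
    print(compare_policies(waits_a, waits_b, t_crit=2.776))
```

If the interval excludes zero, the hypothesis that the two policies perform equally on that criterion is rejected at the chosen confidence level; otherwise the data do not separate them.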

AFRIKAANS SUMMARY (translated into English): Worldwide, systems are becoming more complex. As a result, new methods are needed to control these complex systems. On-line planning and control is a relatively new method of controlling systems and has many potential benefits for a variety of users, especially in the manufacturing environment. Davis [3] developed a framework for on-line planning and control, but the framework is currently incomplete. This project attempted to fill one of the gaps in the framework by automating one of the functions, thereby eliminating the need for a human observer. This function, the real-time compromise analysis function, is responsible for comparing the statistical performance estimates to select a control policy to be implemented in the system being controlled (the real-world system). The project developed two techniques to automate the function. The first technique is based on a common technique for statistically comparing two systems, namely the paired-t confidence interval technique. The paired-t confidence interval technique is used to compare the control policies by building confidence intervals of the expected differences for the various performance criteria and by testing the hypothesis that the statistical performance estimates of one control policy are better than those of another control policy. The results of these comparisons are then consolidated into a compromise function that is used to determine which control policy should currently be implemented in the real-world system. The second technique developed is derived from, but differs greatly from, Davis's [3] dominance probability density function approach and also uses ideas from the paired-t confidence interval technique.
It compares the control policies by calculating the probability (confidence level) with which it can be assumed that the performance criterion of one of the control policies will have a better performance value than the other, and vice versa. These confidence levels are then consolidated into a compromise function that is used to determine which control policy should currently be implemented in the real-world system. After the techniques were developed, it was not possible to evaluate their effectiveness mathematically, because their statistical basis is suspect. The techniques therefore had to be implemented before they could be evaluated. It was decided to develop an emulator of the on-line planning and control process, according to the framework developed by Davis [3], so that the techniques could be implemented. This Emulator is a Visual Basic program that uses Arena models. To make the Emulator possible, certain deviations from the framework were necessary. The first of these is that the systems controlled with on-line planning and control are complex systems, whereas the system controlled by the Emulator is only a simple M/M/1/FIFO/∞/∞ system. This makes it possible to bypass aspects that have not yet been addressed sufficiently, for example the initialisation of the system models. Secondly, the Emulator does not contain all the parts of the framework, and parts for which the technology does not yet exist have been left out. Thirdly, the real-world system is replaced with a model, because a real-world system was not available. Lastly, operations that should actually have been done concurrently were done sequentially, but in such a way that it appears as if they were done concurrently, so that the results are not influenced. The Emulator was used to analyse both techniques for two different traffic intensities.
The first part of the analysis consisted of a non-terminating analysis of the individual control policies of the system. This was used as a baseline against which the Emulator's on-line planning and control process was evaluated. The findings of the evaluation were that, for the traffic intensities evaluated, the techniques deliver results that are comparable to those of the best individual control policies. From these results it was speculated that at different traffic intensities, different control policies will fare better than the techniques, while the techniques will deliver only marginally worse results. And because on-line planning and control attempts to respond to changing conditions, it can be assumed that the techniques will perform well in conditions where the input distribution changes all the time. It is also claimed that the techniques will be advantageous in cases where it is not possible to determine beforehand which of the individual control policies to use, because it is impossible to predict which input distribution will be realised. It is expected that the techniques will deliver good (but unfortunately not necessarily the best) results for any input distribution, while an individual control policy that may give the best results for one input distribution can be catastrophic for another input distribution. Three important conclusions can be drawn from the project. Firstly, it is possible to automate the real-time compromise analysis function. Secondly, an emulator can be developed to evaluate the techniques for the real-time compromise analysis. The greatest advantage of the Emulator is that it can operate considerably faster than real time, which makes it possible to generate enough data to make the statistically significant comparisons needed to evaluate the techniques.
The last conclusion is that, although the initial evaluation is not conclusive, it can be shown that the techniques deserve further study. Three important recommendations can be made from the project. Firstly, the techniques must be studied further, because it cannot be claimed that they are perfect or that they are the only techniques that can work. In fact, they are merely techniques that may work, and other techniques may still prove to be better. Secondly, it would be senseless to claim that the Emulator is complete, and the Emulator must still be improved. The most critical requirement is to develop the Emulator in a programming language and simulation package that allow concurrent operations and effortless initialisation. This will allow the Emulator to be much faster and more flexible. The last recommendation is that the techniques must be evaluated with other parameters in other systems of increasing complexity, culminating in the evaluation of the on-line planning and control process, with the techniques included, in a real-world flexible manufacturing system. Only then will it be possible to say unequivocally whether the techniques are effective or not. It is hoped that this project will form a valuable building block that will help make on-line planning and control a viable alternative for the control of complex systems, because it will allow them to respond better to the changing conditions that are the norm nowadays.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/52513