
A Syntactical Reverse Engineering Approach to Fourth Generation Programming Languages Using Formal Methods

Zohri Yafi, Majd (2022) A Syntactical Reverse Engineering Approach to Fourth Generation Programming Languages Using Formal Methods. PhD thesis, University of Essex.




Fourth-generation programming languages (4GLs) offer rapid development with minimal configuration required of developers. However, 4GLs can suffer from limitations such as high maintenance costs and legacy software practices. Reverse engineering an existing large legacy 4GL system into a currently maintainable programming language can be a cheaper and more effective solution than rewriting it from scratch. So far, no tools exist for reverse engineering proprietary XML-like and model-driven 4GLs whose full language specification is not in the public domain. This research has developed a novel method of reverse engineering some of the syntax of such 4GLs (with Uniface as an exemplar), derived from a particular system, with a view to providing a reliable method to translate/transpile that system's code and data structures into a modern object-oriented language (such as C#). The method was also applied, although only to a limited extent, to other 4GLs (Informix and Apex) to show that it is in principle more broadly applicable. A novel method of testing that the syntax had been successfully translated was provided using abstract syntax trees. The method took manually crafted grammar rules, together with Encapsulated Document Object Model based data from the source language, and then used parsers to produce syntactically valid and equivalent code in the target/output language. This proof-of-concept research has provided a methodology plus sample code to automate part of the process. The methodology comprises a set of manual or semi-automated steps; further automation is left for future research. In principle, the author's method could be extended to recover the syntax of systems developed in other proprietary 4GLs. This would reduce the time and cost of ongoing maintenance of such systems by enabling their software engineers to work with modern object-oriented languages, methodologies, tools and techniques.

Item Type: Thesis (PhD)
Divisions: Faculty of Science and Health > Computer Science and Electronic Engineering, School of
Depositing User: Majd Zohri Yafi
Date Deposited: 11 Feb 2022 10:50
Last Modified: 11 Feb 2022 10:50
