The use of the code analysis library OpenC++: modifications, improvements, error corrections

  • 8/6/2019 The use of the code analysis library OpenC++: modifications, improvements, error corrections

    The use of the code analysis library

    OpenC++: modifications, improvements,

    error corrections

    Author: Andrey Karpov

    Date: 12.01.2008

    Abstract

    This article may interest developers who use, or plan to use, the OpenC++ library (OpenCxx). The author describes his experience of improving OpenC++ and adapting it to specialized tasks.

    Introduction

    One often hears in forums that there are plenty of C++ syntax analyzers ("parsers"), many of them free, or that one can take YACC, for example, and easily implement one's own analyzer. Don't believe it: it is not that easy [1, 2]. This becomes especially clear once you remember that parsing the syntax is not even half of the task. You also have to implement structures for storing the program tree, and semantic tables that hold information about the various objects and their scopes. This matters most when developing specialized applications for processing and static analysis of C++ code. Such applications need the whole program tree to be preserved, and few libraries can provide that. One of them is the open library OpenC++ (OpenCxx) [3], which is the subject of this article.

    We would like to help developers master the OpenC++ library and to share our experience of modernizing it and fixing some of its defects. The article is a collection of tips, each devoted to correcting a particular defect or implementing an improvement.

    The article is based on recollections of the changes made in the VivaCore library [4], which is built on OpenC++. Of course, only a small part of those changes is discussed here; remembering and describing them all would be difficult. Describing how C language support was added to OpenC++, for example, would take too much space. But you can always refer to the source code of the VivaCore library and find a lot of interesting information there.

    It remains to say that the OpenC++ library is, unfortunately, out of date and needs serious work to support the modern C++ language standard. So if you are going to implement a modern compiler, for example, you had better look at GCC or at commercial libraries [5, 6]. But OpenC++ remains a good and convenient tool for many developers of systems for specialized processing and modification of program code. Many interesting solutions have been built with OpenC++, for example the OpenTS execution environment [7] for the T++ programming language (developed by the Program Systems Institute of the RAS), the Viva64 static code analyzer [8], and the Synopsis tool for generating documentation from source code [9].

    The purpose of the article is to show by examples how one can modify and improve OpenC++ library

    code. The article describes 15 library modifications related to error correction or addition of new

    { "__w64", W64 },

    ...

    };

    The next step is to create a class for the new lexeme, which we'll call LeafW64.

    namespace Opencxx

    {

    class LeafW64 : public LeafReserved {

    public:

    LeafW64(Token& t) : LeafReserved(t) {}

    LeafW64(char* str, ptrdiff_t len) :

    LeafReserved(str, len) {}

    ptrdiff_t What() { return W64; }

    };

    }

    To create an object we'll need to modify optIntegralTypeOrClassSpec() function:

    ...

    case UNSIGNED :

    flag = 'U';

    kw = new (GC) LeafUNSIGNED(tk);

    break;

    case W64 : // NEW!

    flag = 'W';

    kw = new (GC) LeafW64(tk);

    break;

    ...

    Note that since we have decided to treat "__w64" as a data type, we need the 'W' symbol to encode this type. You can learn more about the type encoding mechanism in the Encoding.cc file.

    When introducing a new type, remember that functions such as Parser::isTypeSpecifier() must be updated as well.

    The last important point is the modification of the Encoding::MakePtree function:

    Ptree* Encoding::MakePtree(unsigned char*& encoded, Ptree* decl)

    {

    ...

    case 'W' :

    typespec = PtreeUtil::Snoc(typespec, w64_t);

    break;

    ...

    }

    Of course, this is only an example, and adding other lexemes may take much more effort. A good way to add a new lexeme correctly is to pick an existing lexeme that is close in meaning, then find and examine every place in the OpenC++ library where it is used.
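    To make the first step more concrete, here is a minimal sketch of a reserved-word table lookup. It is not the OpenC++ lexer: the token codes, the ReservedWord structure and the ClassifyWord() function are hypothetical names chosen only for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical token codes; the real values are defined by the OpenC++ lexer.
enum TokenKind { Identifier = 258, UNSIGNED = 300, W64 = 301 };

struct ReservedWord {
    const char* name;
    int         token;
};

// Illustrative table in the spirit of the lexer's reserved-word list.
static const ReservedWord kReserved[] = {
    { "unsigned", UNSIGNED },
    { "__w64",    W64 },   // the newly added lexeme
};

// Returns the token code of a reserved word, or Identifier otherwise.
int ClassifyWord(const char* str, std::size_t len) {
    for (const ReservedWord& rw : kReserved) {
        if (std::strlen(rw.name) == len &&
            std::strncmp(rw.name, str, len) == 0)
            return rw.token;
    }
    return Identifier;
}
```

    With such a table, registering a new keyword is a one-line change; the remaining work is in the parser and the type encoder, as described above.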

    3. Skipping complex development-environment keyword constructs that do not affect program processing

    We have already looked at how to skip single keywords that are meaningless for our program but get in the way of parsing. Unfortunately, things are sometimes more difficult. Take constructs such as __pragma and __noop, which you can see in the Visual C++ header files:

    __forceinline DWORD HEAP_MAKE_TAG_FLAGS (

    DWORD TagBase, DWORD Tag )

    {

    __pragma(warning(push)) __pragma(warning(disable : 4548)) do

    {__noop(TagBase);} while((0,0) __pragma(warning(pop)) );

    return ((DWORD)((TagBase) + ((Tag)

    ...

    }

    The solution is to modify the Lex::ReadToken function so that it skips a DECLSPEC, MSPRAGMA or MS__NOOP lexeme when it meets one, and then skips all the lexemes that make up the parameters of __pragma and __noop. All the unnecessary lexemes are skipped with the SkipDeclspecToken() function, as shown below.

    ptrdiff_t Lex::ReadToken(char*& ptr, ptrdiff_t& len)

    {

    ...

    else if(t == DECLSPEC){

    SkipDeclspecToken();

    continue;

    }

    else if(t == MSPRAGMA) { // NEW

    SkipDeclspecToken();

    continue;

    }

    else if(t == MS__NOOP) { //NEW

    SkipDeclspecToken();

    continue;

    }

    ...

    }
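    The body of SkipDeclspecToken() is not shown in this article. The idea of skipping a keyword together with its balanced parenthesized arguments can be sketched as follows; the token representation (a vector of strings) and the function name SkipKeywordWithArgs() are assumptions made for this illustration and do not come from OpenC++.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Given the index of a keyword such as "__pragma" or "__noop", returns the
// index of the first token after its balanced "( ... )" argument list.
// If no '(' follows, only the keyword itself is skipped.
std::size_t SkipKeywordWithArgs(const std::vector<std::string>& toks,
                                std::size_t pos) {
    std::size_t i = pos + 1;
    if (i >= toks.size() || toks[i] != "(")
        return i;
    int depth = 0;
    for (; i < toks.size(); ++i) {
        if (toks[i] == "(")
            ++depth;
        else if (toks[i] == ")" && --depth == 0)
            return i + 1;   // matching ')' found
    }
    return toks.size();     // unbalanced input: skip to the end
}
```

    Counting the nesting depth matters because the arguments themselves contain parentheses, as in __pragma(warning(push)).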

    4. A function for expanding file names to full paths

    In source code analysis, much of the functionality is concerned with producing error messages and navigating through source files. The inconvenience is that file names returned by functions such as Program::LineNumber() may come in different forms. Here are some examples:

    C:\\Program Files\\MSVS 8\\VC\\atlmfc\\include\\afx.h

    .\\drawing.cpp

    c:\\src\\wxwindows-2.4.2\\samples\\drawing\\wx/defs.h

    Boost\\boost-1_33_1\\boost/variant/recursive_variant.hpp

    ..\\FieldEdit2\\Src\\amsEdit.cpp

    ..\\..\\..\\src\\base\\ftbase.c

    A path may be full or relative, and different separators may be used. All this makes such paths inconvenient to process or to print in diagnostic messages. That is why we offer an implementation of the FixFileName() function, which brings paths to a uniform full form. The auxiliary function GetInputFileDirectory() returns the path to the directory that contains the file being processed.

    const string &GetInputFileDirectory() {

    static string oldInputFileName;

    static string fileDirectory;

    string dir;

    VivaConfiguration &cfg = VivaConfiguration::Instance();

    string inputFileName;

    cfg.GetInputFileName(inputFileName);

    if (oldInputFileName == inputFileName)

    return fileDirectory;

    oldInputFileName = inputFileName;

    filesystem::path inputFileNamePath(inputFileName,

    filesystem::native);

    fileDirectory = inputFileNamePath.branch_path().string();

    if (fileDirectory.empty()) {

    TCHAR curDir[MAX_PATH];

    if (GetCurrentDirectory(MAX_PATH, curDir) != 0) {

    fileDirectory = curDir;

    } else {

    assert(false);

    }

    }

    algorithm::replace_all(fileDirectory, "/", "\\");

    to_lower(fileDirectory);

    return fileDirectory;

    }

    typedef map<string, string> StrStrMap;

    typedef StrStrMap::iterator StrStrMapIt;

    void FixFileName(string &fileName) {

    static StrStrMap FileNamesMap;

    StrStrMapIt it = FileNamesMap.find(fileName);

    if (it != FileNamesMap.end()) {

    fileName = it->second;

    return;

    }

    string oldFileName = fileName;

    algorithm::replace_all(fileName, "/", "\\");

    algorithm::replace_all(fileName, "\\\\", "\\");

    filesystem::path tmpPath(fileName, filesystem::native);

    fileName = tmpPath.string();

    algorithm::replace_all(fileName, "/", "\\");

    to_lower(fileName);

    if (fileName.length() < 2) {

    assert(false);

    FileNamesMap.insert(make_pair(oldFileName, fileName));

    return;

    }

    if (fileName[0] == '.' && fileName[1] != '.') {

    const string &dir = GetInputFileDirectory();

    if (!dir.empty())

    fileName.replace(0, 1, dir);

    FileNamesMap.insert(make_pair(oldFileName, fileName));

    return;

    }

    if (isalpha(fileName[0]) && fileName[1] == ':' ) {

    FileNamesMap.insert(make_pair(oldFileName, fileName));

    return;

    }

    const string &dir = GetInputFileDirectory();

    if (dir.empty())

    fileName.insert(0, ".\\");

    else {

    fileName.insert(0, "\\");

    fileName.insert(0, dir);

    }

    FileNamesMap.insert(make_pair(oldFileName, fileName));

    }
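    The normalization steps used above (one kind of separator, no doubled separators, lower case, a base directory for relative names) can be tried out without Boost or the Windows API. The following is a simplified sketch under stated assumptions: NormalizeSeparators() and MakeFullPath() are hypothetical names, the baseDir parameter stands in for GetInputFileDirectory(), and ".." components are not resolved.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Converts '/' to '\', collapses repeated '\' and lower-cases the result.
// Note: collapsing also changes a leading "\\" of a UNC path.
std::string NormalizeSeparators(std::string path) {
    std::replace(path.begin(), path.end(), '/', '\\');
    std::string out;
    for (char c : path) {
        if (c == '\\' && !out.empty() && out.back() == '\\')
            continue;  // drop a duplicated separator
        out.push_back(static_cast<char>(
            std::tolower(static_cast<unsigned char>(c))));
    }
    return out;
}

// Turns a relative name into a full one by prefixing baseDir.
std::string MakeFullPath(const std::string& name, const std::string& baseDir) {
    std::string p = NormalizeSeparators(name);
    bool hasDrive = p.size() >= 2 &&
                    std::isalpha(static_cast<unsigned char>(p[0])) &&
                    p[1] == ':';
    if (hasDrive)
        return p;                       // already a full path
    if (p.compare(0, 2, ".\\") == 0)
        p.erase(0, 2);                  // strip the leading ".\"
    return NormalizeSeparators(baseDir + "\\" + p);
}
```

    Like FixFileName(), a real implementation would probably cache the results, since the same header names are converted many times during analysis.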

    5. Getting values of numerical literals

    A function that returns the value of a numeric literal can be useful in systems that build documentation from code. For example, it lets you see that the argument of the function "void foo(a = 99)" is 99 and use this for some purpose.

    The GetLiteralType() function we offer returns the type of a literal and, if the literal is an integer, its value. GetLiteralType() was written to obtain the most frequently needed information and does not support rarely used forms of notation. If you need to support UCNs, for example, or to get values of the double type, you can extend the functions given below yourself.

    unsigned __int64 c, n = 0, overflow = 0;

    int digits_found = 0;

    const char *limit = from + len;

    while (from < limit)

    {

    c = *from;

    if (!isxdigit(c))

    break;

    from++;

    overflow |= n ^ (n << 4 >> 4);

    n = (n ...

    ...

    ... > '7')

    break;

    from++;

    overflow |= static_cast<...>(n ^ (n << 3 >> 3));

    n = (n ...

    ...

    return GetHex(p, len, retValue);

    }

    case '0': case '1': case '2': case '3':

    case '4': case '5': case '6': case '7': {

    const char *p = from + 1;

    return GetOct(p, len, retValue);

    }

    case '\\': case '\'': case '"': case '?':

    break;

    case 'a': c = charconsts[0]; break;

    case 'b': c = charconsts[1]; break;

    case 'f': c = charconsts[3]; break;

    case 'n': c = charconsts[4]; break;

    case 'r': c = charconsts[5]; break;

    case 't': c = charconsts[6]; break;

    case 'v': c = charconsts[7]; break;

    case 'e': case 'E': c = charconsts[2]; break;

    default:

    assert(false);

    return false;

    }

    retValue = c;

    return true;

    }

    //'A', '\t', L'A', '\xFE'

    static bool GetCharLiteral(const char *from,

    size_t len,

    unsigned __int64 &retValue) {

    if (len >= 3) {

    if (from[0] == '\'' && from[len - 1] == '\'') {

    unsigned char c = from[1] ;

    if (c == '\\') {

    verify(GetEscape(from + 2, len - 3, retValue));

    } else {

    retValue = c;

    }

    return true;

    }

    }

    if (len >= 4) {

    if (from[0] == 'L' &&

    from[1] == '\'' &&

    from[len - 1] == '\'') {

    unsigned char c = from[2];

    if (c == '\\') {

    verify(GetEscape(from + 3, len - 4, retValue));

    } else {

    retValue = c;

    }

    return true;

    }

    }

    return false;

    }

    // "string"

    static bool GetStringLiteral(const char *from, size_t len) {

    if (len >= 2) {

    if (from[0] == '"' && from[len - 1] == '"')

    return true;

    }

    if (len >= 3) {

    if (from[0] == 'L' &&

    from[1] == '"' &&

    from[len - 1] == '"')

    return true;

    }

    return false;

    }

    bool IsRealLiteral(const char *from, size_t len) {

    if (len < 2)

    return false;

    bool isReal = false;

    bool digitFound = false;

    for (size_t i = 0; i != len; ++i) {

    unsigned char c = from[i];

    switch(c) {

    case 'x': return false;

    case 'X': return false;

    case 'f': isReal = true; break;

    case 'F': isReal = true; break;

    case '.': isReal = true; break;

    case 'e': isReal = true; break;

    case 'E': isReal = true; break;

    case 'l': break;

    case '-': break;

    case '+': break;

    case 'L': break;

    default:

    if (!isdigit(c))

    return false;

    digitFound = true;

    }

    }

    return isReal && digitFound;

    }

    SimpleType GetRealLiteral(const char *from, size_t len) {

    assert(len > 1);

    unsigned char rc1 = from[len - 1];

    if (is_digit(rc1) || rc1 == '.' ||

    rc1 == 'l' || rc1 == 'L' ||

    rc1 == 'e' || rc1 == 'E')

    return ST_DOUBLE;

    if (rc1 == 'f' || rc1 == 'F')

    return ST_FLOAT;

    assert(false);

    return ST_UNKNOWN;

    }

    bool GetBoolLiteral(const char *from, size_t len,

    unsigned __int64 &retValue) {

    if (len == 4 && strncmp(from, "true", 4) == 0) {

    retValue = 1;

    return true;

    }

    if (len == 5 && strncmp(from, "false", 5) == 0) {

    retValue = 0;

    return true;

    }

    return false;

    }

    bool IsHexLiteral(const char *from, size_t len) {

    if (len < 3)

    return false;

    if (from[0] != '0')

    return false;

    if (from[1] != 'x' && from[1] != 'X')

    return false;

    return true;

    }

    SimpleType GetTypeBySufix(const char *from, size_t len) {

    assert(from != NULL);

    if (len == 0)

    return ST_INT;

    assert(!isdigit(*from));

    bool suffix_8 = false;

    bool suffix_16 = false;

    bool suffix_32 = false;

    bool suffix_64 = false;

    bool suffix_i = false;

    bool suffix_l = false;

    bool suffix_u = false;

    while (len != 0) {

    --len;

    const char c = *from++;

    switch(c) {

    case '8': suffix_8 = true; break;

    case '1':

    if (len == 0 || *from++ != '6') {

    assert(false);

    return ST_UNKNOWN;

    }

    --len;

    suffix_16 = true;

    break;

    case '3':

    if (len == 0 || *from++ != '2') {

    assert(false);

    return ST_UNKNOWN;

    }

    --len;

    suffix_32 = true;

    break;

    case '6':

    if (len == 0 || *from++ != '4') {

    assert(false);

    return ST_UNKNOWN;

    }

    --len;

    suffix_64 = true;

    break;

    case 'I':

    case 'i': suffix_i = true; break;

    case 'U':

    case 'u': suffix_u = true; break;

    case 'L':

    case 'l': suffix_l = true; break;

    default:

    assert(false);

    return ST_UNKNOWN;

    }

    }

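    assert(suffix_8 + suffix_16 + suffix_32 + suffix_64 ...

    ...

    if (suffix_64) {

    if (suffix_u) return ST_UINT64;

    return ST_INT64;

    }

    if (suffix_l) {

    if (suffix_u) return ST_ULONG;

    return ST_LONG;

    }

    if (suffix_u) return ST_UINT;

    assert(suffix_i);

    return ST_INT;

    }

    SimpleType GetHexLiteral(const char *from, size_t len,

    unsigned __int64 &retValue) {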

    assert(len >= 3);

    const char *p = from + 2;

    if (!GetHex(p, len, retValue)) {

    return ST_UNKNOWN;

    }

    ptrdiff_t newLen = len - (p - from);

    assert(newLen >= 0 && newLen < static_cast<ptrdiff_t>(len));

    return GetTypeBySufix(p, newLen);

    }

    bool IsOctLiteral(const char *from, size_t len) {

    if (len < 2)

    return false;

    if (from[0] != '0')

    return false;

    return true;

    }

    SimpleType GetOctLiteral(const char *from, size_t len,

    unsigned __int64 &retValue) {

    assert(len >= 2);

    const char *p = from + 1;

    if (!GetOct(p, len, retValue)) {

    return ST_UNKNOWN;

    }

    ptrdiff_t newLen = len - (p - from);

    assert(newLen >= 0 && newLen < static_cast<ptrdiff_t>(len));

    return GetTypeBySufix(p, newLen);

    }

    SimpleType GetDecLiteral(const char *from, size_t len,

    unsigned __int64 &retValue) {

    assert(len >= 1);

    const char *limit = from + len;

    unsigned __int64 n = 0;

    while (from < limit) {

    const char c = *from;

    if (c < '0' || c > '9')

    break;

    from++;

    n = n * 10 + (c - '0');

    }

    ptrdiff_t newLen = limit - from;

    if (newLen == static_cast<ptrdiff_t>(len))

    return ST_UNKNOWN;

    retValue = n;

    assert(newLen >= 0 && newLen < static_cast<ptrdiff_t>(len));

    return GetTypeBySufix(from, newLen);

    }

    SimpleType GetLiteralType(const char *from, size_t len,

    unsigned __int64 &retValue) {

    if (from == NULL || len == 0)

    return ST_UNKNOWN;

    retValue = 1;

    if (GetCharLiteral(from, len, retValue))

    return ST_LESS_INT;

    if (GetStringLiteral(from, len))

    return ST_POINTER;

    if (GetBoolLiteral(from, len, retValue))

    return ST_LESS_INT;

    if (IsRealLiteral(from, len))

    return GetRealLiteral(from, len);

    if (IsHexLiteral(from, len))

    return GetHexLiteral(from, len, retValue);

    if (IsOctLiteral(from, len ))

    return GetOctLiteral(from, len, retValue);

    return GetDecLiteral(from, len, retValue);

    }
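    The overflow check used when accumulating hexadecimal digits deserves a short illustration. The sketch below is a standalone variant written with standard types (std::uint64_t instead of unsigned __int64); ParseHexDigits() and its parameters are hypothetical names and this is not the GetHex() function from VivaCore.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>

// Parses hexadecimal digits from text[0..len). Stores the value and the
// number of digits consumed; returns false if there are no digits or the
// value does not fit into 64 bits.
bool ParseHexDigits(const char* text, std::size_t len,
                    std::uint64_t& value, std::size_t& consumed) {
    std::uint64_t n = 0, overflow = 0;
    std::size_t i = 0;
    for (; i < len; ++i) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (!std::isxdigit(c))
            break;
        unsigned digit = std::isdigit(c) ? c - '0'
                                         : std::tolower(c) - 'a' + 10;
        // If any of the top four bits is set, shifting left and back loses
        // it, so the XOR is non-zero exactly when the next shift overflows.
        overflow |= n ^ (n << 4 >> 4);
        n = (n << 4) + digit;
    }
    value = n;
    consumed = i;
    return i != 0 && overflow == 0;
}
```

    The characters left after the digits (for example "u" or "i64") are what a suffix analyzer such as GetTypeBySufix() would receive.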

    6. Correction of string literal processing function

    We suggest modifying the Lex::ReadStrConst() function as shown below. This corrects two errors in the processing of split string literals. The first error occurs with strings of the following kind:

    const char *name = "Viva \

    Core";

    The second:

    const wchar_t *str = L"begin"L"end";

    The corrected function variant:

    bool Lex::ReadStrConst(size_t top, bool isWcharStr)

    {

    char c;

    for(;;){

    c = file->Get();

    if(c == '\\'){

    c = file->Get();

    // Support: "\"

    if (c == '\r') {

    c = file->Get();

    if (c != '\n')

    return false;

    } else if(c == '\0')

    return false;

    }

    else if(c == '"'){

    size_t pos = file->GetCurPos() + 1;

    ptrdiff_t nline = 0;

    do{

    c = file->Get();

    if(c == '\n')

    ++nline;

    } while(is_blank(c) || c == '\n');

    if (isWcharStr && c == 'L') {

    //Support: L"123" L"456" L "789".

    c = file->Get();

    if(c == '"')

    /* line_number += nline; */ ;

    else{

    file->Unget();

    return false;

    }

    } else {

    if(c == '"')

    /* line_number += nline; */ ;

    else{

    token_len = ptrdiff_t(pos - top);

    file->Rewind(pos);

    return true;

    }

    }

    }

    else if(c == '\n' || c == '\0')

    return false;

    }

    }
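    The two cases handled above, a backslash followed by a line break and adjacent literals that form one string, can be shown on a small standalone function. This is a simplified sketch for narrow strings only; ReadJoinedString() is a hypothetical name, it works on a std::string rather than on the lexer's file buffer, and it leaves ordinary escape sequences undecoded.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Reads one or more adjacent string literals starting at src[pos] and
// returns their raw contents joined together. A backslash followed by a
// line break is treated as a line continuation.
// Returns false if src[pos] is not '"' or a literal is not terminated.
bool ReadJoinedString(const std::string& src, std::size_t pos,
                      std::string& out) {
    out.clear();
    if (pos >= src.size() || src[pos] != '"')
        return false;
    while (pos < src.size() && src[pos] == '"') {
        ++pos;  // opening quote
        for (;;) {
            if (pos >= src.size() || src[pos] == '\n')
                return false;           // literal not closed on this line
            char c = src[pos++];
            if (c == '"')
                break;                  // closing quote
            if (c == '\\') {
                if (pos >= src.size())
                    return false;
                if (src[pos] == '\r' && pos + 1 < src.size() &&
                    src[pos + 1] == '\n') {
                    pos += 2;           // backslash, CR, LF: continuation
                    continue;
                }
                if (src[pos] == '\n') {
                    ++pos;              // backslash, LF: continuation
                    continue;
                }
                out.push_back(c);       // keep the escape as written
                out.push_back(src[pos++]);
                continue;
            }
            out.push_back(c);
        }
        // Skip whitespace between adjacent literals.
        while (pos < src.size() &&
               std::isspace(static_cast<unsigned char>(src[pos])))
            ++pos;
    }
    return true;
}
```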

    7. Partial correction of the processing of "bool r = a < 1 || b > (int) 2;"

    type expressions

    OpenC++ has an error in the processing of some expressions, which are mistaken for templates. For example, in the line "bool r = a < 1 || b > (int) 2;" the variable "a" is taken for a template name, and a lot of trouble with syntax analysis follows. Correcting this error completely requires substantial changes and has not been done yet. We offer a temporary solution that eliminates most of the errors. The functions that should be added or modified are given below.

    bool VivaParser::MaybeTypeNameOrClassTemplate(Token &token) {

    if (m_env == NULL) {

    return true;

    }

    const char *ptr = token.GetPtr( );

    ptrdiff_t len = token.GetLen();

    Bind *bind;

    bool isType = m_env->LookupType(ptr, len, bind);

    return isType;

    }

    static bool isOperatorInTemplateArg(ptrdiff_t t) {

    return t == AssignOp || t == EqualOp || t == LogOrOp ||

    t == LogAndOp || t == IncOp || t == RelOp;

    }

    /*

    template.args : '<' any* '>'

    template.args must be followed by '(' or '::'

    */

    bool Parser::isTemplateArgs()

    {

    ptrdiff_t i = 0;

    ptrdiff_t t = lex->LookAhead(i++);

    if(t == '<'){

    ...

    return false;

    if (isOperatorInTemplateArg(u) &&

    next == Identifier)

    return false;

    if(u == '<')

    ++n;

    else if(u == '>')

    --n;

    else if(u == '('){

    ptrdiff_t m = 1;

    while(m > 0){

    ptrdiff_t v = lex->LookAhead(i++);

    if(v == '(')

    ++m;

    else if(v == ')')

    --m;

    else if(v == '\0' || v == ';' || v == '}')

    return false;

    }

    }

    else if(u == '\0' || u == ';' || u == '}')

    return false;

    }

    t = lex->LookAhead(i);

    return bool(t == Scope || t == '(');

    }

    return false;

    }
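    The heuristic can be illustrated on a plain token list. The sketch below is a simplified variant and not the VivaCore code: tokens are strings, LooksLikeTemplateArgs() is a hypothetical name, and any "||" or "&&" inside the angle brackets rejects the candidate, which is stricter than the operator check shown above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if toks[pos] == "<" plausibly opens a template argument
// list: a matching ">" is found before ";" or "}", and the token after it
// is "(" or "::".
bool LooksLikeTemplateArgs(const std::vector<std::string>& toks,
                           std::size_t pos) {
    if (pos >= toks.size() || toks[pos] != "<")
        return false;
    int angle = 1, paren = 0;
    std::size_t i = pos + 1;
    for (; i < toks.size() && angle > 0; ++i) {
        const std::string& t = toks[i];
        if (t == ";" || t == "}")
            return false;               // statement ended first
        if (t == "(") {
            ++paren;
        } else if (t == ")") {
            if (--paren < 0)
                return false;           // unbalanced parentheses
        } else if (paren == 0) {
            if (t == "<") ++angle;
            else if (t == ">") --angle;
            else if (t == "||" || t == "&&")
                return false;           // looks like a comparison
        }
    }
    if (angle != 0 || i >= toks.size())
        return false;
    return toks[i] == "(" || toks[i] == "::";
}
```

    Such a check is only a heuristic: an expression like "a < b > (c)" still passes it, which is why the text above calls the correction partial.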

    8. Improved error correction

    Unfortunately, the error recovery mechanism in OpenC++ sometimes crashes the program. The problem spots in OpenC++ are pieces of code similar to this:

    if(!rDefinition(def)){

    if(!SyntaxError())

    return false;

    SkipTo('}');

    lex->GetToken(cp); // WARNING: crash in the same case.

    body = PtreeUtil::List(new Leaf(op), 0, new Leaf(cp));

    return true;

    }

    You should look at the places where errors are handled and correct them in the way shown for the Parser::rLinkageBody() and Parser::SyntaxError() functions. The general idea of the corrections is that, after an error occurs, the presence of the next lexeme should first be checked with the CanLookAhead() function instead of extracting it immediately with GetToken().

    bool Parser::rLinkageBody(Ptree*& body)

    {

    Token op, cp;

    Ptree* def;

    if(lex->GetToken(op) != '{')

    return false;

    body = 0;

    while(lex->LookAhead(0) != '}'){

    if(!rDefinition(def)){

    if(!SyntaxError())

    return false; // too many errors

    if (lex->CanLookAhead(1)) {

    SkipTo('}');

    lex->GetToken(cp);

    if (!lex->CanLookAhead(0))

    return false;

    } else {

    return false;

    }

    body =

    PtreeUtil::List(new (GC) Leaf(op), 0,

    new (GC) Leaf(cp));

    return true; // error recovery

    }

    body = PtreeUtil::Snoc(body, def);

    }

    lex->GetToken(cp);

    body = new (GC)

    PtreeBrace(new (GC) Leaf(op), body, new (GC) Leaf(cp));

    return true;

    }

    bool Parser::SyntaxError()

    {

    syntaxErrors_ = true;

    Token t, t2;

    if (lex->CanLookAhead(0)) {

    lex->LookAhead(0, t);

    } else {

    lex->LookAhead(-1, t);

    }

    if (lex->CanLookAhead(1)) {

    lex->LookAhead(1, t2);

    } else {

    t2 = t;

    }

    SourceLocation location(GetSourceLocation(*this, t.ptr));

    string token(t2.ptr, t2.len);

    errorLog_.Report(ParseErrorMsg(location, token));

    return true;

    }
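    The idea of checking before looking ahead can be shown on a tiny token buffer. The class below is only an illustration: the method names mirror CanLookAhead(), LookAhead(), GetToken() and SkipTo(), but TokenBuffer itself, its integer tokens and the use of 0 as the end-of-input marker are assumptions, not the OpenC++ Lex class.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class TokenBuffer {
public:
    explicit TokenBuffer(std::vector<int> toks)
        : toks_(std::move(toks)), pos_(0) {}

    // True if the token at the current position plus offset exists.
    bool CanLookAhead(std::ptrdiff_t offset) const {
        std::ptrdiff_t idx = static_cast<std::ptrdiff_t>(pos_) + offset;
        return idx >= 0 && idx < static_cast<std::ptrdiff_t>(toks_.size());
    }

    // Returns the token at the given offset, or 0 when out of range.
    int LookAhead(std::ptrdiff_t offset) const {
        if (!CanLookAhead(offset))
            return 0;
        return toks_[static_cast<std::size_t>(
            static_cast<std::ptrdiff_t>(pos_) + offset)];
    }

    // Extracts the current token, or returns 0 at the end of input.
    int GetToken() { return CanLookAhead(0) ? toks_[pos_++] : 0; }

    // Error recovery: advances to `stop`; reports whether it was found.
    bool SkipTo(int stop) {
        while (CanLookAhead(0) && LookAhead(0) != stop)
            ++pos_;
        return CanLookAhead(0);
    }

private:
    std::vector<int> toks_;
    std::size_t      pos_;
};
```

    A caller that recovers from an error would test the result of SkipTo() before calling GetToken(), which corresponds to the CanLookAhead() checks added to rLinkageBody().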

    9. Update of rTemplateDecl2 function

    Without going into details, we suggest replacing the rTemplateDecl2() function with the variant given below. This eliminates some errors when working with template classes.

    bool Parser::rTemplateDecl2(Ptree*& decl,

    TemplateDeclKind &kind)

    {

    Token tk;

    Ptree *args = 0;

    if(lex->GetToken(tk) != TEMPLATE)

    return false;

    if(lex->LookAhead(0) != '<')

    ...

    decl = PtreeUtil::Snoc(decl, new (GC) Leaf(tk));

    if(!rTempArgList(args))

    return false;

    if(lex->GetToken(tk) != '>')

    return false;

    }

    decl =

    PtreeUtil::Nconc(decl,

    PtreeUtil::List(args, new (GC) Leaf(tk)));

    // ignore nested TEMPLATE

    while (lex->LookAhead(0) == TEMPLATE) {

    lex->GetToken(tk);

    if(lex->LookAhead(0) != '<')

    ...

    ... != '>')

    return false;

    }

    if (args == 0)

    // template < > declaration

    kind = tdk_specialization;

    else

    // template < ... > declaration

    kind = tdk_decl;

    return true;

    }

    10. Detection of Ptree position in the program text

    Sometimes you need to know where in the program text the code lies from which a particular Ptree object was built. The function below returns the addresses of the beginning and the end of the memory region that holds the program text from which the given Ptree object was created.

    void GetPtreePos(const Ptree *p, const char *&begin,

    const char *&end) {

    if (p == NULL)

    return;

    if (p->IsLeaf()) {

    const char *pos = p->GetLeafPosition();

    if (begin == NULL) {

    begin = pos;

    } else {

    begin = min(begin, pos);

    }

    end = max(end, pos);

    }

    else {

    GetPtreePos(p->Car(), begin, end);

    GetPtreePos(p->Cdr(), begin, end);

    }

    }
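    The same traversal can be tried on a self-contained tree. In the sketch below the Node structure and GetNodeRange() are hypothetical stand-ins for Ptree and GetPtreePos(). Unlike the function above, the sketch extends the end of the range by the length of the last leaf, which is an assumption about what a caller usually wants, and it compares pointers with std::less so that the initial null values are handled explicitly.

```cpp
#include <cstddef>
#include <functional>

// Illustrative node type: a leaf points into the source buffer,
// an inner node has two children.
struct Node {
    const char* text;   // leaf: start of the lexeme; inner node: nullptr
    std::size_t len;    // leaf: lexeme length
    const Node* car;
    const Node* cdr;
};

// Widens [begin, end) so that it covers every leaf under p.
// begin and end must both be nullptr before the first call.
void GetNodeRange(const Node* p, const char*& begin, const char*& end) {
    if (p == nullptr)
        return;
    if (p->text != nullptr) {
        const char* last = p->text + p->len;
        std::less<const char*> before;
        if (begin == nullptr || before(p->text, begin))
            begin = p->text;
        if (end == nullptr || before(end, last))
            end = last;
        return;
    }
    GetNodeRange(p->car, begin, end);
    GetNodeRange(p->cdr, begin, end);
}
```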

    11. Support of const A (a) type definitions

    The OpenC++ library doesn't support variable definitions of the form "const A (a)". To correct this defect, the following fragment of the Parser::rOtherDeclaration function should be changed:

    if(!rDeclarators(decl, type_encode, false))

    return false;

    It should be replaced with the following code:

    if(!rDeclarators(decl, type_encode, false)) {

    // Support: const A (a);

    Lex::TokenIndex after_rDeclarators = lex->Save();

    lex->Restore(before_rDeclarators);

    if (lex->CanLookAhead(3) && lex->CanLookAhead(-2)) {

    ptrdiff_t c_2 = lex->LookAhead(-2);

    ptrdiff_t c_1 = lex->LookAhead(-1);

    ptrdiff_t c0 = lex->LookAhead(0);

    ptrdiff_t c1 = lex->LookAhead(1);

    ptrdiff_t c2 = lex->LookAhead(2);

    ptrdiff_t c3 = lex->LookAhead(3);

    if (c_2 == CONST && c_1 == Identifier &&

    c0 == '(' && c1 == Identifier && c2 == ')' &&

    (c3 == ';' || c3 == '='))

    {

    Lex::TokenContainer newEmptyContainer;

    ptrdiff_t pos = before_rDeclarators;

    lex->ReplaceTokens(pos + 2, pos + 3, newEmptyContainer);

    lex->ReplaceTokens(pos + 0, pos + 1, newEmptyContainer);

    lex->Restore(before_rDeclarators - 2);

    bool res = rDeclaration(statement);

    return res;

    }

    }

    }

    This code uses some auxiliary functions that are not discussed in this article, but you can find them in the VivaCore library.
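    For readers who have not met this form of declaration, here is a small example of what the parser has to accept. The type A and the variable names are chosen only for illustration.

```cpp
struct A {
    int value;
    A() : value(7) {}
};

// Both lines declare a const object. The parentheses around the
// declarator name are redundant but legal in C++.
const A (a);          // same as: const A a;
const int (n) = 5;    // same as: const int n = 5;
```

    The token sequence "CONST Identifier ( Identifier )" followed by ';' or '=' is exactly the pattern the corrected code looks for.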

    12. Support for in-class definitions of functions of the form T (min)() { }

    Sometimes a programmer has to use workarounds to get the desired result. For example, the widely known macro "max" often causes trouble when a class defines a method such as "T max() {return m;}". In this case one resorts to a trick and defines the method as "T (max)() {return m;}". Unfortunately, OpenC++ doesn't understand such definitions inside classes. To correct this defect, the Parser::isConstructorDecl() function should be changed in the following way:

    bool Parser::isConstructorDecl()
    {
      if(lex->LookAhead(0) != '(')
        return false;
      else{
        // Support: T (min)() { }
        if (lex->LookAhead(1) == Identifier &&
            lex->LookAhead(2) == ')' &&
            lex->LookAhead(3) == '(')
          return false;
        ptrdiff_t t = lex->LookAhead(1);
        if(t == '*' || t == '&' || t == '(')
          return false; // declarator
        else if(t == CONST || t == VOLATILE)
          return true; // constructor or declarator
        else if(isPtrToMember(1))
          return false; // declarator (::*)
        else
          return true; // maybe constructor
      }
    }
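
    The trick that this change supports can be sketched as follows. A function-like macro is expanded only when its name is directly followed by an opening parenthesis, so wrapping the name in parentheses suppresses the expansion. The macro below only imitates the well-known one; the class Holder and the function largestOf() are hypothetical names used for illustration.

```cpp
#include <cassert>

// A function-like macro imitating the one that some platform headers define.
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Hypothetical class for illustration.
class Holder {
public:
    explicit Holder(int m) : m_(m) {}
    // "max" is followed by ')' and not by '(', so the macro is not expanded.
    int (max)() const { return m_; }
private:
    int m_;
};

int largestOf(const Holder& h, int other) {
    int m = (h.max)();       // calls the method without triggering the macro
    return max(m, other);    // here the macro is expanded as usual
}

// Keep the macro from leaking into code that follows this example.
#undef max
```

    The definition "int (max)() const" inside Holder has exactly the shape "T (min)() { }" that the modified isConstructorDecl() no longer mistakes for a constructor.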

    13. Processing of "using" and "namespace" constructions inside functions

    The OpenC++ library doesn't know that "using" and "namespace" constructions may be used inside
    functions. This is easy to correct by modifying the Parser::rStatement() function:

    bool Parser::rStatement(Ptree*& st)
    {
      ...
      case USING :
        return rUsing(st);
      case NAMESPACE :
        if (lex->LookAhead(2) == '=')
          return rNamespaceAlias(st);
        return rExprStatement(st);
      ...
    }
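
    For reference, here is a small sketch of the constructions that the modified rStatement() has to recognize at block scope. The namespaces company and text and the function greetingLength() are hypothetical and serve only as an illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical namespaces for illustration.
namespace company { namespace text {
    inline std::string greet() { return "hello"; }
} }

std::size_t greetingLength() {
    namespace ct = company::text;   // namespace alias: "namespace" <name> "=" ...
    using std::string;              // using-declaration inside a function
    using namespace company;        // using-directive inside a function
    string a = ct::greet();         // reached through the alias
    string b = text::greet();       // reached through the using-directive
    return a.size() + b.size();
}
```

    In the alias declaration the "=" token is the second token after "namespace", which matches the LookAhead(2) check in the modified function.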

    14. Making "this" a pointer

    As is well known, "this" is a pointer, but OpenC++ does not treat it as one. To fix this type
    identification error, the Walker::TypeofThis() function should be corrected.

    Replace the code

    void Walker::TypeofThis(Ptree*, TypeInfo& t)
    {
      t.Set(env->LookupThis());
    }

    with

    void Walker::TypeofThis(Ptree*, TypeInfo& t)
    {
      t.Set(env->LookupThis());
      t.Reference(); // "this" is a pointer to the class, not the class itself
    }
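
    The language rule behind this correction can be checked with a short sketch in standard C++11. The struct Widget is a hypothetical name; the checks only illustrate the rule and are not part of OpenC++.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type for illustration.
struct Widget {
    int id;

    int viaThis() {
        // Inside a non-const member function "this" has the type Widget*.
        static_assert(std::is_pointer<decltype(this)>::value,
                      "this must be a pointer");
        static_assert(std::is_same<decltype(this), Widget*>::value,
                      "this must be Widget*");
        return this->id;   // a pointer is dereferenced with ->
    }

    // Inside a const member function "this" has the type const Widget*.
    const Widget* self() const { return this; }
};
```

    An analyzer that reports the type of "this" as Widget instead of Widget* would, for example, misjudge the expression "this->id" or a comparison of "this" with another pointer.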

    15. Optimization of LineNumber() function

    We have already mentioned the Program::LineNumber() function when noting that it returns file
    names in different formats, and we offered the FixFileName() function to correct that. But
    LineNumber() has one more disadvantage: it works slowly. That is why we offer an optimized
    variant of the LineNumber() function.

    /*
      LineNumber() returns the line number of the line
      pointed to by PTR.
    */
    size_t Program::LineNumber(const char* ptr,
                               const char*& filename,
                               ptrdiff_t& filename_length,
                               const char *&beginLinePtr) const
    {
      beginLinePtr = NULL;
      ptrdiff_t n;
      size_t len;
      size_t name;
      ptrdiff_t nline = 0;
      size_t pos = ptr - buf;
      size_t startPos = pos;
      if(pos > size){
        // error?
        assert(false);
        filename = defaultname.c_str();
        filename_length = defaultname.length();
        beginLinePtr = buf;
        return 0;
      }
      ptrdiff_t line_number = -1;
      filename_length = 0;
      while(pos > 0){
        if (pos == oldLineNumberPos) {
          line_number = oldLineNumber + nline;
          assert(!oldFileName.empty());
          filename = oldFileName.c_str();
          filename_length = oldFileName.length();
          assert(oldBeginLinePtr != NULL);
          if (beginLinePtr == NULL)
            beginLinePtr = oldBeginLinePtr;
          oldBeginLinePtr = beginLinePtr;

    ...
          break;
        }
      }
      if(filename_length == 0){
        filename = defaultname.c_str();
        filename_length = defaultname.length();
        oldFileName = std::string(filename,
                                  filename_length);
      }
      if (line_number < 0) {
        line_number = nline + 1;
        if (beginLinePtr == NULL)
          beginLinePtr = buf;
        oldBeginLinePtr = beginLinePtr;
        oldLineNumber = line_number;
        oldLineNumberPos = startPos;
      }
      return line_number;
    }
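
    The idea behind the optimization is to remember the previous answer (oldLineNumberPos and oldLineNumber above) so that a query does not have to rescan the buffer from the very beginning. The following is a simplified, self-contained sketch of that caching idea only. The class CachedLineCounter is hypothetical, ignores "#line" directives and file names, and is not the VivaCore implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: counts lines relative to the previously queried offset.
class CachedLineCounter {
public:
    explicit CachedLineCounter(std::string text)
        : text_(std::move(text)), lastPos_(0), lastLine_(1) {}

    // Returns the 1-based line number of the character at offset pos.
    std::size_t lineAt(std::size_t pos) {
        if (pos > text_.size())
            pos = text_.size();
        std::size_t line = lastLine_;
        if (pos >= lastPos_) {
            // Moving forward: add the newlines between the cached offset and pos.
            for (std::size_t i = lastPos_; i < pos; ++i)
                if (text_[i] == '\n') ++line;
        } else {
            // Moving backward: subtract the newlines between pos and the cached offset.
            for (std::size_t i = pos; i < lastPos_; ++i)
                if (text_[i] == '\n') --line;
        }
        lastPos_ = pos;     // remember this answer for the next query
        lastLine_ = line;
        return line;
    }

private:
    std::string text_;
    std::size_t lastPos_;
    std::size_t lastLine_;
};
```

    Since an analyzer usually asks for line numbers of neighbouring tree nodes, each query then scans only the short distance from the previous one.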

    16. Correction of an error in the analysis of the "#line" directive

    In some cases the Program::ReadLineDirective() function malfunctions and mistakes irrelevant text
    for a "#line" directive. The corrected variant of the function looks as follows:

    ptrdiff_t Program::ReadLineDirective(size_t i,
                                         ptrdiff_t line_number,
                                         size_t& filename, size_t& filename_length) const
    {
      char c;
      do{
        c = Ref(++i);
      } while(is_blank(c));
    #if defined(_MSC_VER) || defined(IRIX_CC)
      if(i + 5
    ...
            } while(c != '"');
            if(i > fname_start + 2){
              filename = fname_start;
              filename_length = i - fname_start + 1;
            }
          }
        }
      }
      return line_number;
    }
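
    To illustrate what strict recognition of such a directive may look like, here is a small standalone sketch. It is not the VivaCore code: the function parseLineDirective() and its parameters are hypothetical, and it assumes that only the forms # <number> "file" and #line <number> "file" (with the file name optional) should be accepted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: accepts  # <num> ["file"]  and  #line <num> ["file"].
// On success stores the number and the file name (possibly empty) and returns true.
bool parseLineDirective(const std::string& s, long& line, std::string& file) {
    std::size_t i = 0;
    auto skipBlanks = [&] {
        while (i < s.size() && (s[i] == ' ' || s[i] == '\t')) ++i;
    };
    skipBlanks();
    if (i >= s.size() || s[i] != '#') return false;
    ++i;
    skipBlanks();
    if (s.compare(i, 4, "line") == 0) {
        i += 4;
        // "line" must end here; otherwise it is another word such as "linear".
        if (i >= s.size() || (s[i] != ' ' && s[i] != '\t')) return false;
        skipBlanks();
    }
    // A line number is mandatory; "#pragma", "#define" and the like stop here.
    if (i >= s.size() || !std::isdigit(static_cast<unsigned char>(s[i])))
        return false;
    long num = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) {
        num = num * 10 + (s[i] - '0');
        ++i;
    }
    skipBlanks();
    std::string name;
    if (i < s.size() && s[i] == '"') {
        std::size_t close = s.find('"', i + 1);
        if (close == std::string::npos) return false;   // unterminated file name
        name = s.substr(i + 1, close - i - 1);
    }
    line = num;
    file = name;
    return true;
}
```

    The sketch does not guard against overflow in very long numbers; it only shows the kind of checks that keep unrelated text from being accepted as a directive.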

    Conclusion

    Of course, this article covers only a small part of the possible improvements. But we hope that
    they will be useful to developers using the OpenC++ library and will serve as examples of how one
    can specialize the library for one's own tasks.

    We'd like to remind you once more that the improvements shown in this article and many other
    corrections can be found in the code of the VivaCore library. The VivaCore library may be more
    convenient than OpenC++ for many tasks.

    If you have questions or would like to add or comment on something, our Viva64.com [10] team is
    always glad to communicate. We are ready to discuss any questions that arise, give recommendations
    and help you to use the OpenC++ library or the VivaCore library. Write to us!

    References

    1. Zuev E.A. The rare occupation. PC Magazine/Russian Edition. N 5(75), 1997. http://www.viva64.com/go.php?url=43.
    2. Margaret A. Ellis, Bjarne Stroustrup. The Annotated C++ Reference Manual. Addison Wesley, 1990.
    3. OpenC++ library. http://www.viva64.com/go.php?url=16.
    4. Andrey Karpov, Evgeniy Ryzhkov. The essence of the code analysis library VivaCore. http://www.viva64.com/art-2-2-449187005.html
    5. Semantic Designs site. http://www.viva64.com/go.php?url=19.
    6. Interstron Company. http://www.viva64.com/go.php?url=42.
    7. What is OpenTS? http://www.viva64.com/go.php?url=17.
    8. Evgeniy Ryzhkov. Viva64: what is it and for whom is it meant? http://www.viva64.com/art-1-2-903037923.html
    9. Synopsis: A Source-code Introspection Tool. http://www.viva64.com/go.php?url=18.
    10. OOO "Program Verification Systems" site. http://www.viva64.com.