Custom Scripting Language 101

By Justin Sterling
( Celestialkey )
July 20, 2010 July 22, 2010
Index...
1. What is a scripting language?
2. What makes up a scripting language?
3. Creating Rules and Token Identifiers
4. The Script Reader Header
5. The Script Reader Source
6. Code Listing
What is a Scripting Language?

Ask what a scripting language is and you will often get varying answers. The wiki definition, "A scripting language, script language or extension language is a programming language that allows control of one or more software applications.", is good enough for generalization purposes, though it is vaguer than what you would normally hear. For our purposes, a scripting language can be considered more specifically to be a 3rd party language, independent of the actual application, that modifies the flow of the main application. Script languages come in many different forms. You have the very coherent scripting languages such as Torque script (Torque Engine), Unreal script (Unreal Tournament II + III), Lua (Tibia and other games), JASS (Warcraft III TFT), Game Maker Script (Game Maker 5.4+), and the list continues. These are professionally created languages that help players interact with the game and create customizable worlds. Other languages such as BrainFuck,
++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++ ..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.
Outputs "Hello World" in BrainFuck

LOLCODE,
HAI
CAN HAS STDIO?
VISIBLE "HAI WORLD!"
KTHXBYE
Outputs "Hai World" in lolcats speech

Chef,
Put cinnamon into 2nd mixing bowl
Pushes an item onto a stack

Befunge,
"dlroW olleH">:v
             ^,_@
and even more. These languages are esoteric and are sometimes almost impossible to read without at least a minor mastery of the language. The professionally built languages listed earlier are FAR more advanced than anything we will cover, since they are amazingly huge and complex. We will keep ours simple enough to finish this ebook quickly, yet sophisticated enough to build a language from scratch and have it be useful.
What makes up a scripting language?

Scripting languages are built from 5 main parts. You first have a symbol table, which contains everything related to your language: all the keywords, the symbols, the datatypes, etc. After the symbol table comes the lexical analyzer, which converts a character stream (the file loaded with your script) into tokens (keywords, operators, datatypes, etc). From there, the tokens are passed into a parser, which collects them and builds a syntax tree (how an operation or keyword is to be used). After building a tree, your new command is passed into a semantic checker that looks for any semantic errors inside the command (e.g. 2 parameters are given as opposed to 3, or you're missing a parenthesis). You then have an intermediate code generator, which in our case does nothing more than execute the command passed in. You also have a few options to add in, such as a bytecode generator to "assemble" the code into an easier-to-read format, or an optimizer to increase the speed of the code.
Let's take an example of the final product we will end up with at the end of this eBook. Below is an example of a working script that will properly run in our application.
// Test Script
// Celestialkey
// For the book!
// Http://Www.CelestialCoding.com
// The '//' will denote a comment
Var name def ;                            // Create a variable and just place "def" inside it
Var realName def ;
Set name Celestialkey ;                   // Set variables to strings
Print "Hello from " name "!!!" "\n" ;     // Print to the screen with normal
                                          // string, formatting, and with variables
Print "What is your name?" "\n" "|> " ;   // Ask the user for their name
GetInput name ;                           // Get input and store it inside
                                          // variable 'name' that we created
Set realName name ;                       // Set variables equal to each other
Print "Hello, " realName "!!!" "\n" ;
//
$ ;                                       // End of script character sequence
For now, let's start at the bare bones and check out how to create a lexical analyzer from scratch.
Creating the Rules and Token Identifiers

When writing a lexical analyzer, you need to plan out how you want your language to be structured. You can theoretically create an entire new language similar to C++ or a more esoteric one similar to the above examples. Ours will be very high level, and the language will be interpreted, not compiled. This keeps things simple. Our language will follow the following rules:
1. All lines will end with a space followed by a semicolon
2. When concatenating strings, a space is required between each part
3. Variables must be declared before they can be used
4. The value "$ ;" will designate the end of the script (eos)
5. This language will not support if/then or grouping (brackets)
6. This language will be basic, but can easily be expanded on
7. No whitespace can exist between the final command and the eos. Using a comment line can cheat this though.
8. Comments are designated by '//'
and have the following main tokens:
1. Print
2. ;
3. Var
4. GetInput
5. Set
Pretty simple when you compare it to the other MAIN high level languages out there such as C++ or C. You can easily add your own features to this language though.
With those rules now defined, we need to ask ourselves, "How is this going to store everything?" That's a very good question. I quite literally spent 48 hours coding and recoding different designs to make a flexible language. To be honest, it is very difficult. The best way to create a script language is to convert the script into OP codes and then build a virtual machine to run the custom made assembly inside. That is beyond the scope of this book though, so I had to think of a way to make an interpreter. After some debate over 3 good designs that finally worked, I chose the one with the ability to access variables easily, since that is where most work is normally done when you script anything. The basic gist of the lexical analyzer and parser is below. Remember, this is the flow of logic, not the pseudo code.
<The Lexical Analyzer>
Scan each line character by character looking for white space or special characters that describe the end of a token or the beginning of a special token.
After a special character has been found, push the token onto a vector that stores all discovered tokens.
Repeat each process until the token '$ ;' is encountered.
</The Lexical Analyzer>
<The Parser>
Load each token from the discovered tokens list and compare it to the function token list.
If a match is found, set the operation to the matching position, which equates to one of the predefined enumerated functions.
After loading the operation/function, assume all remaining tokens to be parameters until encountering a ';'.
Upon hitting a ';', the new command is pushed onto a command list to be executed later.
Repeat each process until the token '$ ;' is encountered.
</The Parser>
<The Execution>
Load one command at a time and select the correct operation to use.
Once the correct operation is discovered, loop through the parameter list and perform the requested operation by parsing the parameters into a single string.
Execute the command.
</The Execution>
Pretty straightforward, right? The code is a bit messy once we jump into it though. Let's take a look at the header code for our script reader.
The Script Reader Header

#ifndef CELESTIALANALYZER_H_
#define CELESTIALANALYZER_H_

#include <windows.h>
#include <cstdlib>      // atoi
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;

// Functions List
// ADD only fills slot 1, which lines up with the ';' token in the token list
enum{ PRINT, ADD, VAR, GETINPUT, SET, SUBTRACT };

// Variable types list
enum{
    _STRING,    // String
    _INT,       // Integer
    _OPR        // Operation
};

struct command{
    int             Operation;
    vector<string>  ParameterList;
};

struct variable{
    string name;
    string value;
};

struct varType{
    int     Type;
    string  sValue;
    int     iValue;
};

class CelestialAnalyzer{
private:
    vector<string>      m_TokenList;            // Will store all tokens we compare for
    vector<string>      m_DiscoveredTokens;     // Will store all found tokens
    vector<command>     m_CommandList;          // Will store parsed commands from token list
    vector<variable>    m_VariableList;         // Will store variables created in the script
    int                 CurrentVariableId;      // Variable counter used by the source file
public:
    CelestialAnalyzer();
    void    ReadInTokens(ifstream *f);
    varType AnalyzeParameter(string, int);
    void    Parse();
    int     ParseCommand(int);
    int     ExecuteCommands();
};

#endif
Let us break this into chunks and review them one by one. Skipping the headers and the defines, we come to this part:
// Functions List
enum{ PRINT, END_LINE, VAR, GETINPUT, SET };
This is the function list. It has a 1:1 relation to our TokenList, meaning the data we push onto the token list has to follow the same order as this enum. These values will be used to assign function 'operations' to each command. Notice we will not be using END_LINE. It is more of a space filler so we maintain the 1:1 relation to our TokenList, where the same slot holds the ';' token.
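
To make the 1:1 relation concrete, here is a minimal sketch of how a token string maps back to its enum value. The helper names lookupOperation and makeTokenList are my own and are not part of the book's class; the book does this lookup inline inside ParseCommand().

```cpp
#include <string>
#include <vector>

// Same order as the token strings below; END_LINE only holds the slot of ";".
enum Operation { PRINT, END_LINE, VAR, GETINPUT, SET };

// Hypothetical helper: builds the token list in enum order.
std::vector<std::string> makeTokenList() {
    return {"Print", ";", "Var", "GetInput", "Set"};
}

// Hypothetical helper: returns the index of a token in the list,
// which is also its enum value, or -1 if it is not a function token.
int lookupOperation(const std::vector<std::string>& tokenList,
                    const std::string& token) {
    for (std::size_t i = 0; i < tokenList.size(); ++i) {
        if (tokenList[i] == token) return static_cast<int>(i);
    }
    return -1;
}
```

If the push order and the enum order ever drift apart, every lookup after the mismatch selects the wrong operation, which is why the filler entry matters.
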
// Variable types list
enum{ _STRING, _INT, _OPR };
The variable types will be used when we go to look at parameters. We will analyze each one and then decide if it is an integer, an operator, or a string. In our case, since we are keeping this simple, we will work from an assumption and check only the very first character. If it is 0 through 9, then we will treat the entire parameter as an integer. If it is '+', '-', or similar, we would say it is an operator. I just added the operator type to allow for expansion. We won't actually use it in this book.
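
The first-character check can be sketched as a small standalone function. This is only an illustration: classifyParameter and the TYPE_ names are hypothetical (I avoided the _STRING style here because identifiers starting with an underscore and a capital letter are reserved in C++), and unlike the book's analyzer this sketch really does return the operator type.

```cpp
#include <cctype>
#include <string>

enum ParamType { TYPE_STRING, TYPE_INT, TYPE_OPR };

// Hypothetical helper: guesses the type of a parameter from its first character only.
ParamType classifyParameter(const std::string& p) {
    if (p.empty()) return TYPE_STRING;
    unsigned char first = static_cast<unsigned char>(p[0]);
    if (std::isdigit(first)) return TYPE_INT;                        // '0' through '9'
    if (first == '+' || first == '-' || first == '=') return TYPE_OPR;
    return TYPE_STRING;                                              // unknowns are strings
}
```

Note that the comparison is against the characters '0' to '9', not the numbers 0 to 9. A parameter such as 3apples would still be classified as an integer, which is the price of only looking at one character.
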
struct command{
    int             Operation;
    vector<string>  ParameterList;
};
Creating a vector of type 'command' is the simplest way I've found to store the commands after parsing has completed. In this structure, we have an int which stores the operation to be performed, loaded using the 1:1 relation I explained earlier, and a vector of strings which stores the command's parameters.
struct variable{
    string name;
    string value;
};
All variables are treated as strings to make things simple. If we need to perform math, we can always just convert the string into an integer using atoi() or something similar.
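
As one example of "something similar", here is a sketch built on std::strtol, which lets you notice when the text was not a number at all (atoi() simply returns 0 in that case). The helper name toInteger is my own invention, and the sketch does not check for overflow.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: parses a decimal integer out of a variable's string value.
// Returns false, leaving 'out' untouched, when the text is not entirely a number.
bool toInteger(const std::string& text, long& out) {
    if (text.empty()) return false;
    char* end = nullptr;
    long value = std::strtol(text.c_str(), &end, 10);
    if (*end != '\0') return false;     // parsing stopped before the end of the text
    out = value;
    return true;
}
```

A math command could call this on both operands and report a script error when either call returns false.
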
struct varType{
    int     Type;
    string  sValue;
    int     iValue;
};
varType is used when we analyze parameters. The results are returned in either string or integer format. Sometimes the conversion will simply fill both of them out and let us, the programmers, select which one to use.
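class CelestialAnalyzer{
private:
    vector<string>      m_TokenList;
    vector<string>      m_DiscoveredTokens;
    vector<command>     m_CommandList;
    vector<variable>    m_VariableList;
    int                 CurrentVariableId;      // Variable counter used by the source file
public:
    CelestialAnalyzer();
    void    ReadInTokens(ifstream *f);
    varType AnalyzeParameter(string, int);
    void    Parse();
    int     ParseCommand(int);
    int     ExecuteCommands();
};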
This is our main class that we instantiate and use. In the private area of the class, we create 4 vectors. The first one stores all the function tokens we look for, such as Print, Set, GetInput, etc. The second vector stores ALL tokens found in the text file. Anything with white space inside it is broken apart unless certain rules apply, such as an opening quotation mark. When one is found, white space is kept as part of the token until another quotation mark is found. Next we have a command vector that stores all the commands parsed by the parser. Last we have a variable list that stores all found variables. We use this to access and modify variables as well.
We then enter the public area, and the first thing on the list is the constructor. We push all our function tokens onto the TokenList inside the constructor. Next comes ReadInTokens(). Its purpose is obvious from the name, but notice it accepts a pointer to an ifstream as the parameter. This means you have to open the file outside of the parser and then pass the file pointer in. This function will call Parse() after breaking up the file into many tokens. Parse() loops through all the tokens inside m_DiscoveredTokens and groups them into commands and parameters. The basic syntax rule for this is that the command will always be the first token following a ';' token. Parse() will call ParseCommand() until a -1 is returned (-1 means the next token is the end-of-script token '$'). Once all the commands have been parsed, ExecuteCommands() runs them; it is called separately by the test program. ExecuteCommands() calls AnalyzeParameter() for each parameter of the Var and Print commands and then simply carries out the operation.
Okay! Are you guys still with me at this point? If not, be sure to direct your questions to the forums [ http://www.CelestialCoding.com ]. Now that we have the main header for our script reader completed, let's go to the actual source code itself.
The Script Reader Source

This code will take up multiple pages, so I am going to break it up one function at a time. I will go in chronological order (first function called to last function called).
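CelestialAnalyzer::CelestialAnalyzer(){
    m_TokenList.push_back("Print");     // Slot 0, PRINT
    m_TokenList.push_back(";");         // Slot 1, the filler slot
    m_TokenList.push_back("Var");       // Slot 2, VAR
    m_TokenList.push_back("GetInput");  // Slot 3, GETINPUT
    m_TokenList.push_back("Set");       // Slot 4, SET
    CurrentVariableId = 0;              // ExecuteCommands() increments this counter
}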
Here is our constructor where we push everything onto the vector. Remember to push them on in the same order as your enumerated Function/Operations list.
void CelestialAnalyzer::ReadInTokens(std::ifstream *f){
    char c;
    string token;

    // Looping on get() stops at the end of the file without
    // processing the last character a second time
    while(f->get(c)){
        switch(c){
        case '{':
        case '}':
            token.clear();
            token.push_back(c);
            m_DiscoveredTokens.push_back(token);
            token.clear();
            break;
        case '/':{      // Comments
            char buf[255];
            f->get(c);
            if(c=='/')
                f->getline(buf, 255, '\n');
            break;
        }
        case '\"':{     // "Used for entire strings"
            token.clear();
            bool readString = true;
            while(readString && f->get(c)){
                switch(c){
                case '\"':
                case '\n':
                case '\r':
                case 26:    // ^Z (ASCII 26)
                    // Closing quote or end of line: the string is complete
                    m_DiscoveredTokens.push_back(token);
                    token.clear();
                    readString = false;
                    break;
                default:
                    token.push_back(c);
                    break;
                }
            }
            break;
        }
        case '\n':
        case '\r':
        case ' ':
        case 26:        // ^Z (ASCII 26)
        case '\t':
            m_DiscoveredTokens.push_back(token);
            token.clear();
            break;
        default:
            token.push_back(c);
            break;
        }
    }
    f->close();
    cout << "Found " << m_DiscoveredTokens.size() << " tokens.\n";
    Parse();
}
At the start, we declare a few variables we will use throughout the function. After the declarations, we loop until the end of the file has been reached. At the start of each pass, we collect a character from the file stream and run it through a lot of comparisons using a switch statement. You might notice that we support the tokens { and }. This was just an added example to show you how to group objects. When we check for strings, we look for the quotation mark. If it is found, we loop and collect characters until we hit either another quotation mark or the end of the line. This is forgiving if someone forgets to place a closing quotation mark at the end. However, if a semicolon is never found, it will cause undesired results. When checking for comments, we check if the first character is a '/'. If it is, we grab the next character. If the next character is also a '/', then we just read in the rest of the line and discard it. After that, we look for the normal token separators: newlines, spaces, tabs, returns, and the end-of-file character ^Z (ASCII 26).
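
If you want to experiment with these rules without a file on disk, the same ideas can be sketched against any std::istream. This is a simplified variation, not the book's reader: the names tokenize and tokenizeText are hypothetical, empty tokens are dropped instead of being stored, a lone '/' is kept as an ordinary character, and the { } tokens are left out.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a script into tokens using the rules described above.
std::vector<std::string> tokenize(std::istream& in) {
    std::vector<std::string> tokens;
    std::string token;
    char c;
    // Pushes the pending token, skipping empty ones.
    auto flush = [&]() {
        if (!token.empty()) tokens.push_back(token);
        token.clear();
    };
    while (in.get(c)) {                              // ends cleanly at end of input
        if (c == '"') {                              // quoted string: keep the spaces inside
            flush();
            while (in.get(c) && c != '"' && c != '\n') token.push_back(c);
            tokens.push_back(token);
            token.clear();
        } else if (c == '/' && in.peek() == '/') {   // comment: discard the rest of the line
            flush();
            std::string rest;
            std::getline(in, rest);
        } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            flush();                                 // whitespace ends the current token
        } else {
            token.push_back(c);
        }
    }
    flush();
    return tokens;
}

// Convenience wrapper so a script can be passed in as a plain string.
std::vector<std::string> tokenizeText(const std::string& text) {
    std::istringstream in(text);
    return tokenize(in);
}
```

Feeding it the line Print "Hello from " name ; followed by a comment and then $ ; yields the six tokens Print, Hello from (with its trailing space), name, ;, $ and ;.
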
void CelestialAnalyzer::Parse(){
    int test = 0;
    while(test > -1){
        test = ParseCommand(test); // Recursion... Somewhat
    }
    cout << "Number of commands generated: " << m_CommandList.size() << endl;
    for(unsigned int i=0; i<m_CommandList.size(); i++){
        cout << "\tPerform operation " << m_TokenList[m_CommandList[i].Operation].c_str() << endl;
        for(unsigned int z=0; z<m_CommandList[i].ParameterList.size(); z++)
            cout << "\t\tParameter " << z << " \"" << m_CommandList[i].ParameterList[z].c_str() << "\"" << endl;
    }
}
Here, we just call ParseCommand() until it returns -1. We pass in the current position in the m_DiscoveredTokens vector, and the function returns the position where the next command starts. After that, we simply print out the number of commands generated as well as the operation and the parameters for each one.
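int CelestialAnalyzer::ParseCommand(int x){
    command newCom;
    newCom.Operation = -1;
    const int count = (int)m_DiscoveredTokens.size();   // Guard against running past the last token

    while(x < count && m_DiscoveredTokens[x].compare(";") != 0)
    {
        if(newCom.Operation < 0)
        {
            // No operation yet: look this token up in the function token list
            for(unsigned int i=0; i<m_TokenList.size(); i++)
            {
                if(!m_DiscoveredTokens[x].compare(m_TokenList[i]))
                    newCom.Operation = i;
            }
        }
        else if(m_DiscoveredTokens[x].compare(""))
            newCom.ParameterList.push_back(m_DiscoveredTokens[x]);  // Empty tokens are skipped
        x++;
    }
    x++;    // Step past the ';'

    if(newCom.Operation == -1)
        newCom.Operation = 0;   // Default to Print
    m_CommandList.push_back(newCom);

    if(x < count && m_DiscoveredTokens[x].compare("$"))
        return x;   // More commands follow
    return -1;      // End-of-script token found, or no tokens left
}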
We piece the commands together here. At the start, we create a new command and set a default operation of -1 to signify that it is empty for the moment. Then we grab the current token at vector position 'x' and compare it to the ending token ';'. As long as we do not hit that, we check if the operation has been set. If it has not, then this token should be an operation, so we compare it against the token list, and if a match is found, we set the operation to it. If the operation was already set, then the token must be a parameter. Since our file reader likes to read in empty tokens now and then, we ensure that we don't push the parameter onto the parameter list if it is empty. After that we increment the position and repeat. Once we find the ';' and break from the loop, we increment once more to step past the ';' for the next call to the function. We check the operation and set it to 0 (Print) as a default if none was found (Print takes multiple parameters, so it will make the error obvious to the user), then push the command onto the command vector. We then check the current position for the end-of-script token. If found, we return -1; otherwise, we return the current position. Now the fun part.
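
The same grouping can also be written as a single pass over the token list, which makes it impossible to index past the end of the vector. This is a sketch of an alternative, not the book's method: Command and groupCommands are hypothetical names, an unknown operation stays at -1 instead of defaulting to Print, and empty commands are dropped.

```cpp
#include <string>
#include <vector>

struct Command {
    int operation;                          // index into the function token list, -1 if unknown
    std::vector<std::string> parameters;
};

// Hypothetical helper: groups a flat token list into commands, splitting on ";"
// and stopping at the "$" end-of-script token.
std::vector<Command> groupCommands(const std::vector<std::string>& tokens,
                                   const std::vector<std::string>& functionTokens) {
    std::vector<Command> commands;
    Command current{-1, {}};
    bool haveOperation = false;
    for (const std::string& tok : tokens) {
        if (tok.empty()) continue;                  // the reader may produce empty tokens
        if (tok == "$" && !haveOperation) break;    // end of script at the start of a command
        if (tok == ";") {                           // end of the current command
            if (haveOperation) commands.push_back(current);
            current = Command{-1, {}};
            haveOperation = false;
        } else if (!haveOperation) {                // first token names the operation
            for (std::size_t i = 0; i < functionTokens.size(); ++i)
                if (functionTokens[i] == tok) current.operation = static_cast<int>(i);
            haveOperation = true;
        } else {                                    // everything else is a parameter
            current.parameters.push_back(tok);
        }
    }
    return commands;
}
```

Because the loop ends when the tokens run out, a script that forgets the closing $ ; simply stops early here rather than reading past the vector.
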
int CelestialAnalyzer::ExecuteCommands(){
    char buf[25];
    for(unsigned int c = 0; c<m_CommandList.size(); c++)
    {
        switch(m_CommandList[c].Operation){
        case VAR:{
            varType v;
            variable var = {"Default", "Default"};
            CurrentVariableId++;
            for(unsigned int i=0; i<m_CommandList[c].ParameterList.size(); i++)
            {
                v = AnalyzeParameter(m_CommandList[c].ParameterList[i], m_CommandList[c].Operation);
                switch(v.Type)
                {
                case _INT:      // Unknown - To be tested
                    _itoa_s(v.iValue, buf, 25, 10);     // MSVC-specific integer-to-string conversion
                    var.value = buf;
                    break;
                case _STRING:   // Working 100%
                    if(!var.name.compare("Default"))
                        var.name = v.sValue;    // First string parameter is the name
                    else
                        var.value = v.sValue;   // The next one is the starting value
                    break;
                case _OPR:
                    break;
                }
            }
            m_VariableList.push_back(var);
            break;
        }
        case PRINT:{
            string s;
            varType v;
            bool usedVar = false;
            for(unsigned int i=0; i<m_CommandList[c].ParameterList.size(); i++)
            {
                v = AnalyzeParameter(m_CommandList[c].ParameterList[i], m_CommandList[c].Operation);
                switch(v.Type){
                case _INT:
                    s += v.sValue;      // Append the digits as text
                    break;
                case _STRING:
                {
                    usedVar = false;
                    string temp = m_CommandList[c].ParameterList[i];
                    // If the parameter names a variable, print its value
                    for(unsigned int z=0; z<m_VariableList.size(); z++)
                    {
                        if(!temp.compare(m_VariableList[z].name))
                        {
                            usedVar = true;
                            s += m_VariableList[z].value;
                            break;
                        }
                    }
                    if(usedVar)
                        break;
                    // Otherwise copy the text, translating \n and \t
                    for(unsigned int z=0; z<temp.size(); z++)
                    {
                        if(temp[z] == '\\')
                        {
                            if((z+1) < temp.size())     // Avoid buffer overflow
                            {
                                if(temp[z+1] == 'n'){ s.push_back('\n'); z++; }
                                else if(temp[z+1] == 't'){ s.push_back('\t'); z++; }
                            }
                        }
                        else
                            s.push_back(temp[z]);
                    }
                    break;
                }
                case _OPR:
                    break;
                }
            }
            cout << s;
            break;
        }
        case GETINPUT:
        {
            string s;
            getline(cin, s);
            for(unsigned int x=0; x<m_VariableList.size(); x++)
            {
                if(!m_CommandList[c].ParameterList[0].compare(m_VariableList[x].name))
                    m_VariableList[x].value = s;
            }
            break;
        }
        case SET:
        {
            string s = m_CommandList[c].ParameterList[1]; // The value is the second parameter passed in
            // If the value names a variable, use that variable's contents instead
            for(unsigned int x=0; x<m_VariableList.size(); x++)
            {
                if(!m_CommandList[c].ParameterList[1].compare(m_VariableList[x].name))
                    s = m_VariableList[x].value;
            }
            for(unsigned int x=0; x<m_VariableList.size(); x++)
            {
                if(!m_CommandList[c].ParameterList[0].compare(m_VariableList[x].name))
                    m_VariableList[x].value = s;
            }
            break;
        }
        default:
            break;
        }
    }
    cout << "Variables Created: " << m_VariableList.size() << endl;
    return 0;
}
This is where the magic happens. For each command, we check the type of operation and jump into its case. Take the PRINT function as an example. We create a new string at the start as well as a varType variable. We analyze each parameter and store the result in v. We then check if the parameter is a variable identifier. If it is, we add the contents of the variable to the string and move on to the next parameter. Otherwise, we copy the parameter into the complete string one character at a time, checking each character to see if it starts a formatted \n or \t. You can add to this list or keep it as it is. It is your choice.
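
Pulled out on their own, the two steps Print performs on a parameter look roughly like this. Treat it as a sketch under a few assumptions: expandEscapes and renderParameter are hypothetical names, the variables live in a std::map rather than the book's vector, and an unrecognised escape keeps its backslash here whereas the book's loop drops it.

```cpp
#include <map>
#include <string>

// Hypothetical helper: turns the two-character sequences \n and \t into real characters.
std::string expandEscapes(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\\' && i + 1 < text.size()) {
            char next = text[i + 1];
            if (next == 'n') { out.push_back('\n'); ++i; continue; }
            if (next == 't') { out.push_back('\t'); ++i; continue; }
        }
        out.push_back(text[i]);     // ordinary character, or a backslash we do not recognise
    }
    return out;
}

// Hypothetical helper: a parameter that names a variable prints the variable's value;
// anything else is printed as text with its escapes expanded.
std::string renderParameter(const std::map<std::string, std::string>& variables,
                            const std::string& parameter) {
    auto it = variables.find(parameter);
    if (it != variables.end()) return it->second;
    return expandEscapes(parameter);
}
```

Adding a new escape, say \\ for a literal backslash, is then a one-line change inside expandEscapes.
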
varType CelestialAnalyzer::AnalyzeParameter(string p, int Operation){
    varType v;
    v.Type = _STRING;   // Default, so the caller never sees an uninitialized type
    v.iValue = 0;
    switch(Operation){
    case VAR:
        switch(p[0])
        {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            v.Type = _INT;
            v.iValue = atoi(p.c_str());
            break;
        case '+':
        case '=':
            // Operators are not implemented yet, so they fall through
            // and are treated as strings
        default:
            v.Type = _STRING;   // Unknowns are simply returned as strings
            v.sValue = p;
        }
        break;
    case PRINT:
        switch(p[0])
        {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            v.Type = _INT;
            v.iValue = atoi(p.c_str());
            v.sValue = p;
            break;
        case '+':
        case '=':
            // Operators are not implemented yet, so they fall through
            // and are treated as strings
        default:
            v.Type = _STRING;   // Unknowns are simply returned as strings
            v.sValue = p;
        }
        break;
    case GETINPUT:
        break;
    default:
        break;
    }
    return v;
}
This should look pretty obvious. You check the first character of the string. If it is a number, then it is an int. Otherwise, it is an operator or a string. Placing it all together along with our test file, we can expect to get the output below.
Of course! I forgot to give you the test code to run this! Sorry about that, I was excited over the next part of the book! The source code for the test bed is below.
#include "CelestialAnalyzer.h"

CelestialAnalyzer ca;

int main(){
    ifstream scriptFile("script.txt", ios::in);
    if(!scriptFile){
        cout << "Error reading file!\n";
        return 1;   // Nothing to run without a script
    }
    ca.ReadInTokens(&scriptFile);
    cout << "Executing commands..." << endl << endl;
    ca.ExecuteCommands();
    cout << endl << endl;
    return 0;
}
That's it!
Code Listing
You will find a complete code listing here. Downloading the project files will give you the same results. If you want a hard copy of the source code, head to http://www.CelestialCoding.com. Any questions can also be directed there.
[Main.cpp]
#include "CelestialAnalyzer.h"

CelestialAnalyzer ca;

int main(){
    ifstream scriptFile("script.txt", ios::in);
    if(!scriptFile){
        cout << "Error reading file!\n";
        return 1;   // Nothing to run without a script
    }
    ca.ReadInTokens(&scriptFile);
    cout << "Executing commands..." << endl << endl;
    ca.ExecuteCommands();
    cout << endl << endl;
    return 0;
}
[CelestialAnalyzer.h]
#ifndef CELESTIALANALYZER_H_
#define CELESTIALANALYZER_H_

#include <windows.h>
#include <cstdlib>      // atoi
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;

// Functions List
// ADD only fills slot 1, which lines up with the ';' token in the token list
enum{ PRINT, ADD, VAR, GETINPUT, SET, SUBTRACT };

// Variable types list
enum{ _STRING, _INT, _OPR };

struct command{
    int             Operation;
    vector<string>  ParameterList;
};

struct variable{
    string name;
    string value;
};

struct varType{
    int     Type;
    string  sValue;
    int     iValue;
};

class CelestialAnalyzer{
private:
    vector<string>      m_TokenList;
    vector<string>      m_DiscoveredTokens;
    vector<command>     m_CommandList;
    vector<variable>    m_VariableList;
    int                 CurrentVariableId;      // Variable counter used by the source file
public:
    CelestialAnalyzer();
    void    ReadInTokens(ifstream *f);
    varType AnalyzeParameter(string, int);
    void    Parse();
    int     ParseCommand(int);
    int     ExecuteCommands();
};

#endif
[CelestialAnalyzer.cpp]
This file also contains the attempts I made while creating the lexical analyzer portion of this file. I tried both C and C++ approaches before finally deciding on C++ and working out how to do it that way. I only left the attempts in just in case anyone was curious about other methods for reading in a file and checking it. In the end though, it appeared that a switch statement worked best out of all the options. Note that some of the structures or variables used by the attempts might no longer exist inside the header file. They also might have been modified since the trial versions were written. Use them only for curiosity, not for actual parsing or lexical analyzing.
#include "CelestialAnalyzer.h"

CelestialAnalyzer::CelestialAnalyzer(){
    m_TokenList.push_back("Print");     // Slot 0, PRINT
    m_TokenList.push_back(";");         // Slot 1, the filler slot
    m_TokenList.push_back("Var");       // Slot 2, VAR
    m_TokenList.push_back("GetInput");  // Slot 3, GETINPUT
    m_TokenList.push_back("Set");       // Slot 4, SET
    CurrentVariableId = 0;
}

varType CelestialAnalyzer::AnalyzeParameter(string p, int Operation){
    varType v;
    v.Type = _STRING;   // Default, so the caller never sees an uninitialized type
    v.iValue = 0;
    switch(Operation){
    case VAR:
        switch(p[0])
        {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            v.Type = _INT;
            v.iValue = atoi(p.c_str());
            break;
        case '+':
        case '=':
            // Operators are not implemented yet, so they fall through
            // and are treated as strings
        default:
            v.Type = _STRING;   // Unknowns are simply returned as strings
            v.sValue = p;
        }
        break;
    case PRINT:
        switch(p[0])
        {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            v.Type = _INT;
            v.iValue = atoi(p.c_str());
            v.sValue = p;
            break;
        case '+':
        case '=':
            // Operators are not implemented yet, so they fall through
            // and are treated as strings
        default:
            v.Type = _STRING;   // Unknowns are simply returned as strings
            v.sValue = p;
        }
        break;
    case GETINPUT:
    {
        string tmp;
        getline(cin, tmp);
        break;
    }
    default:
        break;
    }
    return v;
}

int CelestialAnalyzer::ExecuteCommands(){
    char buf[25];
    for(unsigned int c = 0; c<m_CommandList.size(); c++)
    {
        switch(m_CommandList[c].Operation){
        case VAR:{
            varType v;
            variable var = {"Default", "Default"};
            CurrentVariableId++;
            for(unsigned int i=0; i<m_CommandList[c].ParameterList.size(); i++)
            {
                v = AnalyzeParameter(m_CommandList[c].ParameterList[i], m_CommandList[c].Operation);
                switch(v.Type)
                {
                case _INT:      // Unknown - To be tested
                    _itoa_s(v.iValue, buf, 25, 10);     // MSVC-specific integer-to-string conversion
                    var.value = buf;
                    break;
                case _STRING:   // Working 100%
                    if(!var.name.compare("Default"))
                        var.name = v.sValue;    // First string parameter is the name
                    else
                        var.value = v.sValue;   // The next one is the starting value
                    break;
                case _OPR:
                    break;
                }
            }
            m_VariableList.push_back(var);
            break;
        }
        case PRINT:{
            string s;
            varType v;
            bool usedVar = false;
            for(unsigned int i=0; i<m_CommandList[c].ParameterList.size(); i++)
            {
                v = AnalyzeParameter(m_CommandList[c].ParameterList[i], m_CommandList[c].Operation);
                switch(v.Type){
                case _INT:
                    s += v.sValue;      // Append the digits as text
                    break;
                case _STRING:
                {
                    usedVar = false;
                    string temp = m_CommandList[c].ParameterList[i];
                    // If the parameter names a variable, print its value
                    for(unsigned int z=0; z<m_VariableList.size(); z++)
                    {
                        if(!temp.compare(m_VariableList[z].name))
                        {
                            usedVar = true;
                            s += m_VariableList[z].value;
                            break;
                        }
                    }
                    if(usedVar)
                        break;
                    // Otherwise copy the text, translating \n and \t
                    for(unsigned int z=0; z<temp.size(); z++)
                    {
                        if(temp[z] == '\\')
                        {
                            if((z+1) < temp.size())     // Avoid buffer overflow
                            {
                                if(temp[z+1] == 'n'){ s.push_back('\n'); z++; }
                                else if(temp[z+1] == 't'){ s.push_back('\t'); z++; }
                            }
                        }
                        else
                            s.push_back(temp[z]);
                    }
                    break;
                }
                case _OPR:
                    break;
                }
            }
            cout << s;
            break;
        }
        case GETINPUT:
        {
            string s;
            getline(cin, s);
            for(unsigned int x=0; x<m_VariableList.size(); x++)
            {
                if(!m_CommandList[c].ParameterList[0].compare(m_VariableList[x].name))
                    m_VariableList[x].value = s;
            }
            break;
        }
        case SET:
        {
            string s = m_CommandList[c].ParameterList[1]; // The value is the second parameter passed in
            // If the value names a variable, use that variable's contents instead
            for(unsigned int x=0; x<m_VariableList.size(); x++)
            {
                if(!m_CommandList[c].ParameterList[1].compare(m_VariableList[x].name))
                    s = m_VariableList[x].value;
            }
            for(unsigned int x=0; x<m_VariableList.size(); x++)
            {
                if(!m_CommandList[c].ParameterList[0].compare(m_VariableList[x].name))
                    m_VariableList[x].value = s;
            }
            break;
        }
        default:
            break;
        }
    }
    cout << "Variables Created: " << m_VariableList.size() << endl;
    return 0;
}

int CelestialAnalyzer::ParseCommand(int x){
    command newCom;
    newCom.Operation = -1;
    const int count = (int)m_DiscoveredTokens.size();   // Guard against running past the last token

    while(x < count && m_DiscoveredTokens[x].compare(";") != 0)
    {
        if(newCom.Operation < 0)
        {
            // No operation yet: look this token up in the function token list
            for(unsigned int i=0; i<m_TokenList.size(); i++)
            {
                if(!m_DiscoveredTokens[x].compare(m_TokenList[i]))
                    newCom.Operation = i;
            }
        }
        else if(m_DiscoveredTokens[x].compare(""))
            newCom.ParameterList.push_back(m_DiscoveredTokens[x]);  // Empty tokens are skipped
        x++;
    }
    x++;    // Step past the ';'

    if(newCom.Operation == -1)
        newCom.Operation = 0;   // Default to Print
    m_CommandList.push_back(newCom);

    if(x < count && m_DiscoveredTokens[x].compare("$"))
        return x;   // More commands follow
    return -1;      // End-of-script token found, or no tokens left
}

void CelestialAnalyzer::Parse(){
    int test = 0;
    while(test > -1){
        test = ParseCommand(test); // Recursion... Somewhat
    }
    cout << "Number of commands generated: " << m_CommandList.size() << endl;
    for(unsigned int i=0; i<m_CommandList.size(); i++){
        cout << "\tPerform operation " << m_TokenList[m_CommandList[i].Operation].c_str() << endl;
        for(unsigned int z=0; z<m_CommandList[i].ParameterList.size(); z++)
            cout << "\t\tParameter " << z << " \"" << m_CommandList[i].ParameterList[z].c_str() << "\"" << endl;
    }
}

void CelestialAnalyzer::ReadInTokens(std::ifstream *f){
    char c;
    string token;

    // Looping on get() stops at the end of the file without
    // processing the last character a second time
    while(f->get(c)){
        switch(c){
        case '{':
        case '}':
            token.clear();
            token.push_back(c);
            m_DiscoveredTokens.push_back(token);
            token.clear();
            break;
        case '/':{      // Comments
            char buf[255];
            f->get(c);
            if(c=='/')
                f->getline(buf, 255, '\n');
            break;
        }
        case '\"':{     // "Used for entire strings"
            token.clear();
            bool readString = true;
            while(readString && f->get(c)){
                switch(c){
                case '\"':
                case '\n':
                case '\r':
                case 26:    // ^Z (ASCII 26)
                    // Closing quote or end of line: the string is complete
                    m_DiscoveredTokens.push_back(token);
                    token.clear();
                    readString = false;
                    break;
                default:
                    token.push_back(c);
                    break;
                }
            }
            break;
        }
        case '\n':
        case '\r':
        case ' ':
        case 26:        // ^Z (ASCII 26)
        case '\t':
            m_DiscoveredTokens.push_back(token);
            token.clear();
            break;
        default:
            token.push_back(c);
            break;
        }
    }
    f->close();
    cout << "Found " << m_DiscoveredTokens.size() << " tokens.\n";
    Parse();
}

/*
void CelestialAnalyzer::ReadInTokens(ifstream *f){
    char command[256];
    while(!f->eof()){
        for(unsigned int i=0; i<c.size(); i++){
            if(c[i] == ' ' || c[i] == '\n' || c[i] == '^Z' || c[i] == ';')
            {
                for(unsigned int i=0; i<m_TokenList.size(); i++)
                {
                    if(!token.compare(m_TokenList[i])){
                        m_DiscoveredTokens.push_back(token);
                        token.clear();
                    }
                }
            } else {
                token.push_back(c[i]);
            }
        }
        Code_List.push_back(f);
    }
    f->close();
    for(unsigned int i=0; i<Code_List.size(); i++){
    }
}

void CelestialAnalyzer::ReadInTokens(ifstream *f){
    vector<string> tokens;
    string token;
    char c;
    bool eof = false;   // End of file
    bool eos = false;   // End of statement
    bool eot = false;   // End of token
    int curVarId = 0;
    int curFuncId = -1;
    do{
        f->get(c);      // Get a character
        if(c == ';')
            eos = true;
        if(c == '\n' || c == ' ' || c == '^Z' || c == '\r' || c == '\t' || f->eof())
        {
            for(unsigned int i=0; i<m_TokenList.size(); i++)
            {
                if(token.compare(m_TokenList[i]) == 0)
                {   // They are equal
                    eot = true;
                    function f;
                    f.FuncType = i;
                    Code_List.push_back(f);
                    token.clear();
                }
            }
        }
    } while(!eof);
}

void CelestialAnalyzer::ReadInTokens(ifstream *f){
    vector<string> tokens;
    string token;
    char c;
    bool eof = false;
    do{
        bool foundToken = false;
        f->get(c);      // Read in each character
        if(c == '\n' || c == ' ' || c == '^Z' || c == '\r' || c == '\t' || f->eof()){ // 26 ASCII ^Z (end of file marker)
            for(unsigned int i=0; i<m_TokenList.size(); i++){
                if(!token.compare(m_TokenList[i])){
                    if(!token.compare(";")){
                    }
                    foundToken = true;
                    tokens.push_back(token);    // Add it to the token array
                    Code_List.push_back(i);     // Directly relates to the enum list
                    token.clear();
                }
            }
            if(!foundToken){
                // Code_List.push_back(-1);
            }
        } else {
            token.push_back(c);
        }
        if(f->eof())
            eof = true;
    } while (!eof);
    f->close();
    // for(unsigned int i=0; i<m_varList.size(); i++){
    //     cout << "Found Variables: " << m_varList[i].name.c_str() << endl;
    // }
    for(unsigned int i=0; i<Code_List.size(); i++){
        cout << "Found Token: " << Code_List[i] << " (" << m_TokenList[Code_List[i]].c_str() << ")" << endl;
    }
}
*/
