This is the main function from which the parser begins its work. On line 1354 we see a call to PyParser_ASTFromFile, the function that constructs the Abstract Syntax Tree (AST) the compiler later works from. So let us step into PyParser_ASTFromFile.

In PyParser_ASTFromFile we see a call to PyParser_ParseFileFlagsEx on line 1499, which returns the root node of the parse tree. This function is the entry point of the Python parser.

PyParser_ParseFileFlagsEx is located in Parser/parsetok.c on line 88. The crux of the logic is in the function parsetok on line 129. The parser enters an infinite loop, scanning the program for tokens until the input ends or an error occurs. Set a breakpoint on line 187 and watch how the tokens of your program are added to the parser's nodes. The lines that follow show how the parse tree is constructed. I shall leave this as an exercise to the reader.
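To make the shape of that loop concrete before stepping through it in the debugger, here is a minimal sketch of a tokenizer driven by an endless loop that exits with break. It is not CPython's code: every name in it (toy_kind, toy_next_token, toy_parse) is hypothetical, and counting tokens merely stands in for handing each token to the parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch (not CPython's token numbers). */
enum toy_kind { TOY_END, TOY_NAME, TOY_NUMBER, TOY_OP };

/* Scan one token starting at *cursor and advance the cursor past it. */
enum toy_kind toy_next_token(const char **cursor)
{
    const char *p = *cursor;
    enum toy_kind kind;

    while (*p == ' ')
        p++;                            /* skip blanks between tokens */

    if (*p == '\0') {
        kind = TOY_END;                 /* nothing left to scan */
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        kind = TOY_NUMBER;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        kind = TOY_NAME;
    } else {
        p++;                            /* any other single character */
        kind = TOY_OP;
    }

    *cursor = p;
    return kind;
}

/* Pull tokens until the end marker shows up.
   counts[] is indexed by toy_kind; the return value is the number of
   real tokens seen (the end marker is counted in counts[] only). */
int toy_parse(const char *source, int counts[4])
{
    int total = 0;

    assert(source != NULL);
    for (;;) {                          /* endless loop, left via break */
        enum toy_kind kind = toy_next_token(&source);
        counts[kind]++;                 /* stand-in for feeding the parser */
        if (kind == TOY_END)
            break;
        total++;
    }
    return total;
}
```

Calling toy_parse on a line such as "x = 42 + y1" with a zeroed counts array walks the same fetch, hand over, check-for-end cycle you will see when you hit the breakpoint in parsetok.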

Next we return to PyParser_ASTFromFile at line 1507. There we see a call to PyAST_FromNode, which takes the root node of the parse tree and constructs the AST. It produces a structure of type _mod, which is declared in Include/Python-ast.h.

struct _mod {
    enum _mod_kind kind;
    union {
        struct {
            asdl_seq *body;
        } Module;

        struct {
            asdl_seq *body;
        } Interactive;

        struct {
            expr_ty body;
        } Expression;

        struct {
            asdl_seq *body;
        } Suite;
    } v;
};

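Three of the four variants (Module, Interactive and Suite) hold an asdl sequence of statements, while Expression holds a single expression. From this structure the byte code is emitted into the core Python code object, which we will see in the coming section.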
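The kind field tells a consumer which member of the union v is valid, so code that walks a _mod typically switches on it. The sketch below shows that pattern on a cut-down copy of the structure. All toy_ names are hypothetical stand-ins made up for this post, and toy_mod_length is an illustration, not a function that exists in CPython.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the real AST types. */
enum toy_mod_kind { TOY_MODULE = 1, TOY_INTERACTIVE, TOY_EXPRESSION };

struct toy_stmt_list { int size; };     /* plays the role of asdl_seq */
struct toy_expr { int lineno; };        /* plays the role of an expression node */

struct toy_mod {
    enum toy_mod_kind kind;             /* says which member of v is in use */
    union {
        struct { struct toy_stmt_list *body; } Module;
        struct { struct toy_stmt_list *body; } Interactive;
        struct { struct toy_expr *body; } Expression;
    } v;
};

/* Number of top-level items: the statement count for the sequence
   variants, 1 for a lone expression, -1 for an unknown kind. */
int toy_mod_length(const struct toy_mod *mod)
{
    assert(mod != NULL);
    switch (mod->kind) {
    case TOY_MODULE:
        return mod->v.Module.body->size;
    case TOY_INTERACTIVE:
        return mod->v.Interactive.body->size;
    case TOY_EXPRESSION:
        return 1;
    }
    return -1;
}
```

Reading v.Module.body when kind says Expression would reinterpret the wrong pointer type, which is why the switch comes first. The toy_stmt_list stand-in corresponds to the real asdl_seq, whose definition follows.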

typedef struct {
    int size;
    void *elements[1];
} asdl_seq; // Include/asdl.h
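The one-element array at the end of asdl_seq is the classic C "struct hack": the structure is allocated with extra room after it, so elements can be indexed past its declared length. Here is a minimal sketch of that allocation trick. It uses plain calloc and hypothetical toy_ helpers, whereas CPython allocates its sequences from an arena and accesses them through its own macros, so treat this as an illustration of the layout only.

```c
#include <assert.h>
#include <stdlib.h>

/* Same layout as asdl_seq, under a hypothetical name. */
typedef struct {
    int size;
    void *elements[1];
} toy_seq;

/* Allocate a sequence with room for `size` pointers.
   One slot is already part of the struct, so only size - 1 extra
   slots are requested. Returns NULL if the allocation fails. */
toy_seq *toy_seq_new(int size)
{
    toy_seq *seq;
    size_t extra;

    assert(size >= 0);
    extra = size > 0 ? (size_t)(size - 1) * sizeof(void *) : 0;
    seq = calloc(1, sizeof(toy_seq) + extra);   /* slots start out NULL */
    if (seq != NULL)
        seq->size = size;
    return seq;
}

/* Store an item at index i, with a bounds check. */
void toy_seq_set(toy_seq *seq, int i, void *item)
{
    assert(seq != NULL && i >= 0 && i < seq->size);
    seq->elements[i] = item;
}

/* Fetch the item at index i, with a bounds check. */
void *toy_seq_get(const toy_seq *seq, int i)
{
    assert(seq != NULL && i >= 0 && i < seq->size);
    return seq->elements[i];
}
```

A caller creates the sequence with toy_seq_new(n), fills it with toy_seq_set, and releases it with a single free, since the header and the slots live in one block. Strictly speaking, indexing beyond the declared array bound relies on behaviour the C standard does not guarantee; C99 flexible array members are the modern replacement for this idiom.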

Next we return to the PyRun_FileExFlags function in pythonrun.c and put a breakpoint on line 1362, at the call to run_mod.

We will look at this function in the coming post.

