circuitpython/py/parse.c
Scott Shawcroft dfb61f01db Merge tag 'v1.8.7'
Support for Xtensa emitter and assembler, and upgraded F4 and F7 STM HAL

This release adds support for the Xtensa architecture as a target for the
native emitter, as well as Xtensa inline assembler.  The int.from_bytes
and int.to_bytes methods now require a second argument (the byte order)
per CPython (only "little" is supported at this time).  The "readall"
method has been removed from all stream classes that used it; "read" with
no arguments should be used instead.  There is now support for importing
packages from compiled .mpy files.  Test coverage is increased to 96%.

The generic I2C driver has improvements: configurable clock stretching
timeout, "stop" argument added to readfrom/writeto methods, "nack"
argument added to readinto, and write[to] now returns num of ACKs
received.  The framebuf module now handles 16-bit depth (generic colour
format) and has hline, vline, rect, line methods.  A new utimeq module is
added for efficient queue ordering defined by modulo time (to be
compatible with time.ticks_xxx functions).  The pyboard.py script has been
modified so that the target board is not reset between scripts or commands
that are given on a single command line.

For the stmhal port the STM Cube HAL has been upgraded: Cube F4 HAL to
v1.13.1 (CMSIS 2.5.1, HAL v1.5.2) and Cube F7 HAL to v1.1.2.  There is a
more robust pyb.I2C implementation (DMA is now disabled by default, can be
enabled via an option), and there is an implementation of machine.I2C with
robust error handling and hardware acceleration on F4 MCUs.  It is now
recommended to use machine.I2C instead of pyb.I2C.  The UART class is now
more robust with better handling of errors/timeouts.  There is also more
accurate VBAT and VREFINT measurements for the ADC.  New boards that are
supported include: NUCLEO_F767ZI, STM32F769DISC and NUCLEO_L476RG.

For the esp8266 port select/poll is now supported for sockets using the
uselect module.  There is support for native and viper emitters, as well
as an inline assembler (with limited iRAM for storage of native functions,
or the option to store code to flash).  There is improved software I2C
with a slight API change: scl/sda pins can be specified as positional only
when "-1" is passed as the first argument to indicate the use of software
I2C.  It is recommended to use keyword arguments for scl/sda.  There is
very early support for over-the-air (OTA) updates using the yaota8266
project.

A detailed list of changes follows.

py core:
- emitnative: fix native import emitter when in viper mode
- remove readall() method, which is equivalent to read() w/o args
- objexcept: allow clearing traceback with 'exc.__traceback__ = None'
- runtime: mp_resume: handle exceptions in Python __next__()
- mkrules.mk: rework find command so it works on OSX
- *.mk: replace uses of 'sed' with $(SED)
- parse: move function to check for const parse node to parse.[ch]
- parse: make mp_parse_node_new_leaf an inline function
- parse: add code to fold logical constants in or/and/not operations
- factor persistent code load/save funcs into persistentcode.[ch]
- factor out persistent-code reader into separate files
- lexer: rewrite mp_lexer_new_from_str_len in terms of mp_reader_mem
- lexer: provide generic mp_lexer_new_from_file based on mp_reader
- lexer: rewrite mp_lexer_new_from_fd in terms of mp_reader
- lexer: make lexer use an mp_reader as its source
- objtype: implement __call__ handling for an instance w/o heap alloc
- factor out common code from assemblers into asmbase.[ch]
- stream: move ad-hoc ioctl constants to stream.h and rename them
- compile: simplify configuration of native emitter
- emit.h: remove long-obsolete declarations for cpython emitter
- move arch-specific assembler macros from emitnative to asmXXX.h
- asmbase: add MP_PLAT_COMMIT_EXEC option for handling exec code
- asmxtensa: add low-level Xtensa assembler
- integrate Xtensa assembler into native emitter
- allow inline-assembler emitter to be generic
- add inline Xtensa assembler
- emitinline: embed entire asm struct instead of a pointer to it
- emitinline: move inline-asm align and data methods to compiler
- emitinline: move common code for end of final pass to compiler
- asm: remove need for dummy_data when doing initial assembler passes
- objint: from_bytes, to_bytes: require byteorder arg, require "little"
- binary: do zero extension when storing a value larger than word size
- builtinimport: support importing packages from compiled .mpy files
- mpz: remove unreachable code in mpn_or_neg functions
- runtime: zero out fs_user_mount array in mp_init
- mpconfig.h: enable MICROPY_PY_SYS_EXIT by default
- add MICROPY_KBD_EXCEPTION config option to provide mp_kbd_exception
- compile: add an extra pass for Xtensa inline assembler
- modbuiltins: remove unreachable code
- objint: rename mp_obj_int_as_float to mp_obj_int_as_float_impl
- emitglue: refactor to remove assert(0), to improve coverage
- lexer: remove unreachable code in string tokeniser
- lexer: remove unnecessary check for EOF in lexer's next_char func
- lexer: permanently disable the mp_lexer_show_token function
- parsenum: simplify and generalise decoding of digit values
- mpz: fix assertion in mpz_set_from_str which checks value of base
- mpprint: add assertion for, and comment about, valid base values
- objint: simplify mp_int_format_size and remove unreachable code
- unicode: comment-out unused function unichar_isprint
- consistently update signatures of .make_new and .call methods
- mkrules.mk: add MPY_CROSS_FLAGS option to pass flags to mpy-cross
- builtinimport: fix bug when importing names from frozen packages

extmod:
- machine_i2c: make the clock stretching timeout configurable
- machine_i2c: raise an error when clock stretching times out
- machine_i2c: release SDA on bus error
- machine_i2c: add a C-level I2C-protocol, refactoring soft I2C
- machine_i2c: add argument to C funcs to control stop generation
- machine_i2c: rewrite i2c.scan in terms of C-level protocol
- machine_i2c: rewrite mem xfer funcs in terms of C-level protocol
- machine_i2c: remove unneeded i2c_write_mem/i2c_read_mem funcs
- machine_i2c: make C-level functions return -errno on I2C error
- machine_i2c: add 'nack' argument to i2c.readinto
- machine_i2c: make i2c.write[to] methods return num of ACKs recvd
- machine_i2c: add 'stop' argument to i2c readfrom/writeto meths
- machine_i2c: remove trivial function wrappers
- machine_i2c: expose soft I2C obj and readfrom/writeto funcs
- machine_i2c: add hook to constructor to call port-specific code
- modurandom: allow to build with float disabled
- modframebuf: make FrameBuffer handle 16bit depth
- modframebuf: add back legacy FrameBuffer1 "class"
- modframebuf: optimise fill and fill_rect methods
- vfs_fat: implement POSIX behaviour of rename, allow to overwrite
- moduselect: use stream helper function instead of ad-hoc code
- moduselect: use configurable EVENT_POLL_HOOK instead of WFI
- modlwip: add ioctl method to socket, with poll implementation
- vfs_fat_file: allow file obj to respond to ioctl flush request
- modbtree: add method to sync the database
- modbtree: rename "sync" method to "flush" for consistency
- modframebuf: add hline, vline, rect and line methods
- machine_spi: provide reusable software SPI class
- modframebuf: make framebuf implement the buffer protocol
- modframebuf: store underlying buffer object to prevent GC free
- modutimeq: copy of current moduheapq with timeq support for refactoring
- modutimeq: refactor into optimized class
- modutimeq: make time_less_than be actually "less than", not less/eq

lib:
- utils/interrupt_char: use core-provided mp_kbd_exception if enabled

drivers:
- display/ssd1306.py: update to use FrameBuffer not FrameBuffer1
- onewire: enable pull up on data pin
- onewire/ds18x20: fix negative temperature calc for DS18B20

tools:
- tinytest-codegen: blacklist recently added uheapq_timeq test (qemu-arm)
- pyboard.py: refactor so target is not reset between scripts/cmd
- mpy-tool.py: add support for OPT_CACHE_MAP_LOOKUP_IN_BYTECODE

tests:
- micropython: add test for import from within viper function
- use read() instead of readall()
- basics: add test for logical constant folding
- micropython: add test for creating traceback without allocation
- micropython: move alloc-less traceback test to separate test file
- extmod: improve ujson coverage
- basics: improve user class coverage
- basics: add test for dict.fromkeys where arg is a generator
- basics: add tests for if-expressions
- basics: change dict_fromkeys test so it doesn't use generators
- basics: enable tests for list slice getting with 3rd arg
- extmod/vfs_fat_fileio: add test for constructor of FileIO type
- extmod/btree1: exercise btree.flush()
- extmod/framebuf1: add basics tests for hline, vline, rect, line
- update for required byteorder arg for int.from_bytes()/to_bytes()
- extmod: improve moductypes test coverage
- extmod: improve modframebuf test coverage
- micropython: get heapalloc_traceback test running on baremetal
- struct*: make skippable
- basics: improve mpz test coverage
- float/builtin_float_round: test round() with second arg
- basics/builtin_dir: add test for dir() of a type
- basics: add test for builtin locals()
- basics/set_pop: improve coverage of set functions
- run-tests: for REPL tests make sure the REPL is exited at the end
- basics: improve test coverage for generators
- import: add a test which uses ... in from-import statement
- add tests to improve coverage of runtime.c
- add tests to improve coverage of objarray.c
- extmod: add test for utimeq module
- basics/lexer: add a test for newline-escaping within a string
- add a coverage test for printing the parse-tree
- utimeq_stable: test for partial stability of utimeq queuing
- heapalloc_inst_call: test for no alloc for simple object calls
- basics: add tests for parsing of ints with base 36
- basics: add tests to improve coverage of binary.c
- micropython: add test for micropython.stack_use() function
- extmod: improve ubinascii.c test coverage
- thread: improve modthread.c test coverage
- cmdline: improve repl.c autocomplete test coverage
- unix: improve runtime_utils.c test coverage
- pyb/uart: update test to match recent change to UART timeout_char
- run-tests: allow to skip set tests
- improve warning.c test coverage
- float: improve formatfloat.c test coverage using Python
- unix: improve formatfloat.c test coverage using C
- unix/extra_coverage: add basic tests to import frozen str and mpy
- types1: split out set type test to set_types
- array: allow to skip test if "array" is unavailable
- unix/extra_coverage: add tests for importing frozen packages

unix port:
- rename define for unix moduselect to MICROPY_PY_USELECT_POSIX
- Makefile: update freedos target for change of USELECT config name
- enable utimeq module
- main: allow to print the parse tree in coverage build
- Makefile: make "coverage_test" target mirror Travis test actions
- moduselect: if file object passed to .register(), return it in .poll()
- Makefile: split long line for coverage target, easier to modify
- enable and add basic frozen str and frozen mpy in coverage build
- Makefile: allow cache-map-lookup optimisation with frozen bytecode

windows port:
- enable READER_POSIX to get access to lexer_new_from_file

stmhal port:
- dma: de-init the DMA peripheral properly before initialising
- i2c: add option to I2C to enable/disable use of DMA transfers
- i2c: reset the I2C peripheral if there was an error on the bus
- rename mp_hal_pin_set_af to _config_alt, to simplify alt config
- upgrade to STM32CubeF4 v1.13.0 - CMSIS/Device 2.5.1
- upgrade to STM32CubeF4 v1.13.0 - HAL v1.5.1
- apply STM32CubeF4 v1.13.1 patch - upgrade HAL driver to v1.5.2
- hal/i2c: reapply HAL commit ea040a4 for f4
- hal/sd: reapply HAL commit 1d7fb82 for f4
- hal: reapply HAL commit 9db719b for f4
- hal/rcc: reapply HAL commit c568a2b for f4
- hal/sd: reapply HAL commit 09de030 for f4
- boards: configure all F4 boards to work with new HAL
- make-stmconst.py: fix regex's to work with current CMSIS
- i2c: handle I2C IRQs
- dma: precalculate register base and bitshift on handle init
- dma: mark DMA sate as READY even if HAL_DMA_Init is skipped
- can: clear FIFO flags in IRQ handler
- i2c: provide custom IRQ handlers
- hal: do not include <stdio.h> in HAL headers
- mphalport.h: use single GPIOx->BSRR register
- make-stmconst.py: add support for files with invalid utf8 bytes
- update HALCOMMITS due to change to hal
- make-stmconst.py: restore Python 2 compatibility
- update HALCOMMITS due to change to hal
- moduselect: move to extmod/ for reuse by other ports
- i2c: use the HAL's I2C IRQ handler for F7 and L4 MCUs
- updates to get F411 MCUs compiling with latest ST HAL
- i2c: remove use of legacy I2C_NOSTRETCH_DISABLED option
- add beginnings of port-specific machine.I2C implementation
- i2c: add support for I2C4 hardware block on F7 MCUs
- i2c: expose the pyb_i2c_obj_t struct and some relevant functions
- machine_i2c: provide HW implementation of I2C peripherals for F4
- add support for flash storage on STM32F415
- add back GPIO_BSRRL and GPIO_BSRRH constants to stm module
- add OpenOCD configuration for STM32L4
- add address parameters to openocd config files
- adc: add "mask" selection parameter to pyb.ADCAll constructor
- adc: provide more accurate measure of VBAT and VREFINT
- adc: make ADCAll.read_core_temp return accurate float value
- adc: add ADCAll.read_vref method, returning "3.3v" value
- adc: add support for F767 MCU
- adc: make channel "16" always map to the temperature sensor
- sdcard: clean/invalidate cache before DMA transfers with SD card
- moduos: implement POSIX behaviour of rename, allow to overwrite
- adc: use constants from new HAL version
- refactor UART configuration to use pin objects
- uart: add support for UART7 and UART8 on F7 MCUs
- uart: add check that UART id is valid for the given board
- cmsis: update STM32F7 CMSIS device include files to V1.1.2
- hal: update ST32CubeF7 HAL files to V1.1.2
- port of f4 hal commit c568a2b to updated f7 hal
- port of f4 hal commit 09de030 to updated f7 hal
- port of f4 hal commit 1d7fb82 to updated f7 hal
- declare and initialise PrescTables for F7 MCUs
- boards/STM32F7DISC: define LSE_STARTUP_TIMEOUT
- hal: update HALCOMMITS due to change in f7 hal files
- refactor to use extmod implementation of software SPI class
- cmsis: add CMSIS file stm32f767xx.h, V1.1.2
- add NUCLEO_F767ZI board, with openocd config for stm32f7
- cmsis: add CMSIS file stm32f769xx.h, V1.1.2
- add STM32F769DISC board files
- move PY_SYS_PLATFORM config from board to general config file
- mpconfigport: add weak-module links for io, collections, random
- rename mp_const_vcp_interrupt to mp_kbd_exception
- usb: always use the mp_kbd_exception object for VCP interrupt
- use core-provided keyboard exception object
- led: properly initialise timer handle to zero before using it
- mphalport.h: explicitly use HAL's GPIO constants for pull modes
- usrsw: use mp_hal_pin_config function instead of HAL_GPIO_Init
- led: use mp_hal_pin_config function instead of HAL_GPIO_Init
- sdcard: use mp_hal_pin_config function instead of HAL_GPIO_Init
- add support for STM32 Nucleo64 L476RG
- uart: provide a custom function to transmit over UART
- uart: increase inter-character timeout by 1ms
- enable utimeq module

cc3200 port:
- tools/smoke.py: change readall() to read()
- pybspi: remove static mode=SPI.MASTER parameter for latest HW API
- mods/pybspi: remove SPI.MASTER constant, it's no longer needed
- update for moduselect moved to extmod/
- re-add support for UART REPL (MICROPY_STDIO_UART setting)
- enable UART REPL by default
- README: (re)add information about accessing REPL on serial
- make: rename "deploy" target to "deploy-ota"
- add targets to erase flash, deploy firmware using cc3200tool
- README: reorganize and update to the current state of affairs
- modwlan: add network.WLAN.print_ver() diagnostic function

esp8266 port:
- enable uselect module
- move websocket_helper.py from scripts to modules for frozen BC
- refactor to use extmod implementation of software SPI class
- mpconfigport_512k: disable framebuf module for 512k build
- enable native emitter for Xtensa arch
- enable inline Xtensa assembler
- add "ota" target to produce firmware binary for use with yaota8266
- use core-provided keyboard exception object
- add "erase" target to Makefile, to erase entire flash
- when doing GC be sure to trace the memory holding native code
- modesp: flash_user_start(): support configuration with yaota8266
- force relinking OTA firmware image if built after normal one
- scripts/inisetup: dump FS starting sector/size on error
- Makefile: produce OTA firmware as firmware-ota.bin
- modesp: make check_fw() work with OTA firmware
- enable utimeq module
- Makefile: put firmware-ota.bin in build/, for consistency
- modules/flashbdev: add RESERVED_SECS before the filesystem
- modules/flashbdev: remove code to patch bootloader flash size
- modules/flashbdev: remove now-unused function set_bl_flash_size
- modules/flashbdev: change RESERVED_SECS to 0

zephyr port:
- add .gitignore to ignore Zephyr's "outdir" directory
- zephyr_getchar: update to Zephyr 1.6 unified kernel API
- switch to Zephyr 1.6 unified kernel API
- support raw REPL
- implement soft reset feature
- main: initialize sys.path and sys.argv
- use core-provided keyboard exception object
- uart_core: access console UART directly instead of printk() hack
- enable slice subscription

docs:
- remove references to readall() and update stream read() docs
- library/index: elaborate on u-modules
- library/machine.I2C: refine definitions of I2C methods
- library/pyb.Accel: add hardware note about pins used by accel
- library/pyb.UART: added clarification about timeouts
- library/pyb.UART: moved writechar doc to sit with other writes
- esp8266/tutorial: update intro to add Getting the firmware section
- library/machine.I2C: fix I2C constructor docs to match impl
- esp8266/tutorial: close socket after reading page content
- esp8266/general: add "Scarcity of runtime resources" section
- library/esp: document esp.set_native_code_location() function
- library/esp: remove para and add further warning about flash
- usocket: clarify that socket timeout raises OSError exception

travis:
- build STM32 F7 and L4 boards under Travis CI
- include persistent bytecode with floats in coverage tests

examples:
- hwapi: button_led: Add GPIO pin read example
- hwapi: add soft_pwm example converted to uasyncio
- http_client: use read() instead of readall()
- hwapi: add uasyncio example of fading 2 LEDs in parallel
- hwapi: add example for machine.time_pulse_us()
- hwapi: add hwconfig for console tracing of LED operations
- accellog.py: change 1: to /sd/, and update comment about FS
- hwapi/hwconfig_console: don't alloc memory in value()
2017-01-10 09:56:09 -08:00

1163 lines
45 KiB
C

/*
* This file is part of the Micro Python project, http://micropython.org/
*
* The MIT License (MIT)
*
* Copyright (c) 2013-2015 Damien P. George
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h> // for ssize_t
#include <assert.h>
#include <string.h>
#include "py/nlr.h"
#include "py/lexer.h"
#include "py/parse.h"
#include "py/parsenum.h"
#include "py/runtime0.h"
#include "py/runtime.h"
#include "py/objint.h"
#include "py/builtin.h"
#if MICROPY_ENABLE_COMPILER
#define RULE_ACT_ARG_MASK (0x0f)
#define RULE_ACT_KIND_MASK (0x30)
#define RULE_ACT_ALLOW_IDENT (0x40)
#define RULE_ACT_ADD_BLANK (0x80)
#define RULE_ACT_OR (0x10)
#define RULE_ACT_AND (0x20)
#define RULE_ACT_LIST (0x30)
#define RULE_ARG_KIND_MASK (0xf000)
#define RULE_ARG_ARG_MASK (0x0fff)
#define RULE_ARG_TOK (0x1000)
#define RULE_ARG_RULE (0x2000)
#define RULE_ARG_OPT_RULE (0x3000)
// (un)comment to use rule names; for debugging
//#define USE_RULE_NAME (1)
typedef struct _rule_t {
byte rule_id;
byte act;
#ifdef USE_RULE_NAME
const char *rule_name;
#endif
uint16_t arg[];
} rule_t;
enum {
#define DEF_RULE(rule, comp, kind, ...) RULE_##rule,
#include "py/grammar.h"
#undef DEF_RULE
RULE_maximum_number_of,
RULE_string, // special node for non-interned string
RULE_bytes, // special node for non-interned bytes
RULE_const_object, // special node for a constant, generic Python object
};
#define or(n) (RULE_ACT_OR | n)
#define and(n) (RULE_ACT_AND | n)
#define and_ident(n) (RULE_ACT_AND | n | RULE_ACT_ALLOW_IDENT)
#define and_blank(n) (RULE_ACT_AND | n | RULE_ACT_ADD_BLANK)
#define one_or_more (RULE_ACT_LIST | 2)
#define list (RULE_ACT_LIST | 1)
#define list_with_end (RULE_ACT_LIST | 3)
#define tok(t) (RULE_ARG_TOK | MP_TOKEN_##t)
#define rule(r) (RULE_ARG_RULE | RULE_##r)
#define opt_rule(r) (RULE_ARG_OPT_RULE | RULE_##r)
#ifdef USE_RULE_NAME
#define DEF_RULE(rule, comp, kind, ...) static const rule_t rule_##rule = { RULE_##rule, kind, #rule, { __VA_ARGS__ } };
#else
#define DEF_RULE(rule, comp, kind, ...) static const rule_t rule_##rule = { RULE_##rule, kind, { __VA_ARGS__ } };
#endif
#include "py/grammar.h"
#undef or
#undef and
#undef list
#undef list_with_end
#undef tok
#undef rule
#undef opt_rule
#undef one_or_more
#undef DEF_RULE
STATIC const rule_t *const rules[] = {
#define DEF_RULE(rule, comp, kind, ...) &rule_##rule,
#include "py/grammar.h"
#undef DEF_RULE
};
typedef struct _rule_stack_t {
size_t src_line : 8 * sizeof(size_t) - 8; // maximum bits storing source line number
size_t rule_id : 8; // this must be large enough to fit largest rule number
size_t arg_i; // this dictates the maximum nodes in a "list" of things
} rule_stack_t;
typedef struct _mp_parse_chunk_t {
size_t alloc;
union {
size_t used;
struct _mp_parse_chunk_t *next;
} union_;
byte data[];
} mp_parse_chunk_t;
typedef enum {
PARSE_ERROR_NONE = 0,
PARSE_ERROR_MEMORY,
PARSE_ERROR_CONST,
} parse_error_t;
typedef struct _parser_t {
parse_error_t parse_error;
size_t rule_stack_alloc;
size_t rule_stack_top;
rule_stack_t *rule_stack;
size_t result_stack_alloc;
size_t result_stack_top;
mp_parse_node_t *result_stack;
mp_lexer_t *lexer;
mp_parse_tree_t tree;
mp_parse_chunk_t *cur_chunk;
#if MICROPY_COMP_CONST
mp_map_t consts;
#endif
} parser_t;
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wcast-align"
STATIC void *parser_alloc(parser_t *parser, size_t num_bytes) {
// use a custom memory allocator to store parse nodes sequentially in large chunks
mp_parse_chunk_t *chunk = parser->cur_chunk;
if (chunk != NULL && chunk->union_.used + num_bytes > chunk->alloc) {
// not enough room at end of previously allocated chunk so try to grow
mp_parse_chunk_t *new_data = (mp_parse_chunk_t*)m_renew_maybe(byte, chunk,
sizeof(mp_parse_chunk_t) + chunk->alloc,
sizeof(mp_parse_chunk_t) + chunk->alloc + num_bytes, false);
if (new_data == NULL) {
// could not grow existing memory; shrink it to fit previous
(void)m_renew_maybe(byte, chunk, sizeof(mp_parse_chunk_t) + chunk->alloc,
sizeof(mp_parse_chunk_t) + chunk->union_.used, false);
chunk->alloc = chunk->union_.used;
chunk->union_.next = parser->tree.chunk;
parser->tree.chunk = chunk;
chunk = NULL;
} else {
// could grow existing memory
chunk->alloc += num_bytes;
}
}
if (chunk == NULL) {
// no previous chunk, allocate a new chunk
size_t alloc = MICROPY_ALLOC_PARSE_CHUNK_INIT;
if (alloc < num_bytes) {
alloc = num_bytes;
}
chunk = (mp_parse_chunk_t*)m_new(byte, sizeof(mp_parse_chunk_t) + alloc);
chunk->alloc = alloc;
chunk->union_.used = 0;
parser->cur_chunk = chunk;
}
byte *ret = chunk->data + chunk->union_.used;
chunk->union_.used += num_bytes;
return ret;
}
#pragma GCC diagnostic pop
STATIC void push_rule(parser_t *parser, size_t src_line, const rule_t *rule, size_t arg_i) {
if (parser->parse_error) {
return;
}
if (parser->rule_stack_top >= parser->rule_stack_alloc) {
rule_stack_t *rs = m_renew_maybe(rule_stack_t, parser->rule_stack, parser->rule_stack_alloc, parser->rule_stack_alloc + MICROPY_ALLOC_PARSE_RULE_INC, true);
if (rs == NULL) {
parser->parse_error = PARSE_ERROR_MEMORY;
return;
}
parser->rule_stack = rs;
parser->rule_stack_alloc += MICROPY_ALLOC_PARSE_RULE_INC;
}
rule_stack_t *rs = &parser->rule_stack[parser->rule_stack_top++];
rs->src_line = src_line;
rs->rule_id = rule->rule_id;
rs->arg_i = arg_i;
}
STATIC void push_rule_from_arg(parser_t *parser, size_t arg) {
assert((arg & RULE_ARG_KIND_MASK) == RULE_ARG_RULE || (arg & RULE_ARG_KIND_MASK) == RULE_ARG_OPT_RULE);
size_t rule_id = arg & RULE_ARG_ARG_MASK;
assert(rule_id < RULE_maximum_number_of);
push_rule(parser, parser->lexer->tok_line, rules[rule_id], 0);
}
STATIC void pop_rule(parser_t *parser, const rule_t **rule, size_t *arg_i, size_t *src_line) {
assert(!parser->parse_error);
parser->rule_stack_top -= 1;
*rule = rules[parser->rule_stack[parser->rule_stack_top].rule_id];
*arg_i = parser->rule_stack[parser->rule_stack_top].arg_i;
*src_line = parser->rule_stack[parser->rule_stack_top].src_line;
}
bool mp_parse_node_is_const_false(mp_parse_node_t pn) {
return MP_PARSE_NODE_IS_TOKEN_KIND(pn, MP_TOKEN_KW_FALSE)
|| (MP_PARSE_NODE_IS_SMALL_INT(pn) && MP_PARSE_NODE_LEAF_SMALL_INT(pn) == 0);
}
bool mp_parse_node_is_const_true(mp_parse_node_t pn) {
return MP_PARSE_NODE_IS_TOKEN_KIND(pn, MP_TOKEN_KW_TRUE)
|| (MP_PARSE_NODE_IS_SMALL_INT(pn) && MP_PARSE_NODE_LEAF_SMALL_INT(pn) != 0);
}
bool mp_parse_node_get_int_maybe(mp_parse_node_t pn, mp_obj_t *o) {
if (MP_PARSE_NODE_IS_SMALL_INT(pn)) {
*o = MP_OBJ_NEW_SMALL_INT(MP_PARSE_NODE_LEAF_SMALL_INT(pn));
return true;
} else if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, RULE_const_object)) {
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t*)pn;
#if MICROPY_OBJ_REPR == MICROPY_OBJ_REPR_D
// nodes are 32-bit pointers, but need to extract 64-bit object
*o = (uint64_t)pns->nodes[0] | ((uint64_t)pns->nodes[1] << 32);
#else
*o = (mp_obj_t)pns->nodes[0];
#endif
return MP_OBJ_IS_INT(*o);
} else {
return false;
}
}
int mp_parse_node_extract_list(mp_parse_node_t *pn, size_t pn_kind, mp_parse_node_t **nodes) {
if (MP_PARSE_NODE_IS_NULL(*pn)) {
*nodes = NULL;
return 0;
} else if (MP_PARSE_NODE_IS_LEAF(*pn)) {
*nodes = pn;
return 1;
} else {
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t*)(*pn);
if (MP_PARSE_NODE_STRUCT_KIND(pns) != pn_kind) {
*nodes = pn;
return 1;
} else {
*nodes = pns->nodes;
return MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
}
}
}
#if MICROPY_DEBUG_PRINTERS
void mp_parse_node_print(mp_parse_node_t pn, size_t indent) {
if (MP_PARSE_NODE_IS_STRUCT(pn)) {
printf("[% 4d] ", (int)((mp_parse_node_struct_t*)pn)->source_line);
} else {
printf(" ");
}
for (size_t i = 0; i < indent; i++) {
printf(" ");
}
if (MP_PARSE_NODE_IS_NULL(pn)) {
printf("NULL\n");
} else if (MP_PARSE_NODE_IS_SMALL_INT(pn)) {
mp_int_t arg = MP_PARSE_NODE_LEAF_SMALL_INT(pn);
printf("int(" INT_FMT ")\n", arg);
} else if (MP_PARSE_NODE_IS_LEAF(pn)) {
uintptr_t arg = MP_PARSE_NODE_LEAF_ARG(pn);
switch (MP_PARSE_NODE_LEAF_KIND(pn)) {
case MP_PARSE_NODE_ID: printf("id(%s)\n", qstr_str(arg)); break;
case MP_PARSE_NODE_STRING: printf("str(%s)\n", qstr_str(arg)); break;
case MP_PARSE_NODE_BYTES: printf("bytes(%s)\n", qstr_str(arg)); break;
case MP_PARSE_NODE_TOKEN: printf("tok(%u)\n", (uint)arg); break;
default: assert(0);
}
} else {
// node must be a mp_parse_node_struct_t
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t*)pn;
if (MP_PARSE_NODE_STRUCT_KIND(pns) == RULE_string) {
printf("literal str(%.*s)\n", (int)pns->nodes[1], (char*)pns->nodes[0]);
} else if (MP_PARSE_NODE_STRUCT_KIND(pns) == RULE_bytes) {
printf("literal bytes(%.*s)\n", (int)pns->nodes[1], (char*)pns->nodes[0]);
} else if (MP_PARSE_NODE_STRUCT_KIND(pns) == RULE_const_object) {
#if MICROPY_OBJ_REPR == MICROPY_OBJ_REPR_D
printf("literal const(%016llx)\n", (uint64_t)pns->nodes[0] | ((uint64_t)pns->nodes[1] << 32));
#else
printf("literal const(%p)\n", (mp_obj_t)pns->nodes[0]);
#endif
} else {
size_t n = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
#ifdef USE_RULE_NAME
printf("%s(%u) (n=%u)\n", rules[MP_PARSE_NODE_STRUCT_KIND(pns)]->rule_name, (uint)MP_PARSE_NODE_STRUCT_KIND(pns), (uint)n);
#else
printf("rule(%u) (n=%u)\n", (uint)MP_PARSE_NODE_STRUCT_KIND(pns), (uint)n);
#endif
for (size_t i = 0; i < n; i++) {
mp_parse_node_print(pns->nodes[i], indent + 2);
}
}
}
}
#endif // MICROPY_DEBUG_PRINTERS
/*
STATIC void result_stack_show(parser_t *parser) {
printf("result stack, most recent first\n");
for (ssize_t i = parser->result_stack_top - 1; i >= 0; i--) {
mp_parse_node_print(parser->result_stack[i], 0);
}
}
*/
STATIC mp_parse_node_t pop_result(parser_t *parser) {
if (parser->parse_error) {
return MP_PARSE_NODE_NULL;
}
assert(parser->result_stack_top > 0);
return parser->result_stack[--parser->result_stack_top];
}
STATIC mp_parse_node_t peek_result(parser_t *parser, size_t pos) {
if (parser->parse_error) {
return MP_PARSE_NODE_NULL;
}
assert(parser->result_stack_top > pos);
return parser->result_stack[parser->result_stack_top - 1 - pos];
}
STATIC void push_result_node(parser_t *parser, mp_parse_node_t pn) {
if (parser->parse_error) {
return;
}
if (parser->result_stack_top >= parser->result_stack_alloc) {
mp_parse_node_t *stack = m_renew_maybe(mp_parse_node_t, parser->result_stack, parser->result_stack_alloc, parser->result_stack_alloc + MICROPY_ALLOC_PARSE_RESULT_INC, true);
if (stack == NULL) {
parser->parse_error = PARSE_ERROR_MEMORY;
return;
}
parser->result_stack = stack;
parser->result_stack_alloc += MICROPY_ALLOC_PARSE_RESULT_INC;
}
parser->result_stack[parser->result_stack_top++] = pn;
}
STATIC mp_parse_node_t make_node_string_bytes(parser_t *parser, size_t src_line, size_t rule_kind, const char *str, size_t len) {
mp_parse_node_struct_t *pn = parser_alloc(parser, sizeof(mp_parse_node_struct_t) + sizeof(mp_parse_node_t) * 2);
if (pn == NULL) {
parser->parse_error = PARSE_ERROR_MEMORY;
return MP_PARSE_NODE_NULL;
}
pn->source_line = src_line;
pn->kind_num_nodes = rule_kind | (2 << 8);
char *p = m_new(char, len);
memcpy(p, str, len);
pn->nodes[0] = (uintptr_t)p;
pn->nodes[1] = len;
return (mp_parse_node_t)pn;
}
STATIC mp_parse_node_t make_node_const_object(parser_t *parser, size_t src_line, mp_obj_t obj) {
mp_parse_node_struct_t *pn = parser_alloc(parser, sizeof(mp_parse_node_struct_t) + sizeof(mp_obj_t));
if (pn == NULL) {
parser->parse_error = PARSE_ERROR_MEMORY;
return MP_PARSE_NODE_NULL;
}
pn->source_line = src_line;
#if MICROPY_OBJ_REPR == MICROPY_OBJ_REPR_D
// nodes are 32-bit pointers, but need to store 64-bit object
pn->kind_num_nodes = RULE_const_object | (2 << 8);
pn->nodes[0] = (uint64_t)obj;
pn->nodes[1] = (uint64_t)obj >> 32;
#else
pn->kind_num_nodes = RULE_const_object | (1 << 8);
pn->nodes[0] = (uintptr_t)obj;
#endif
return (mp_parse_node_t)pn;
}
STATIC void push_result_token(parser_t *parser, const rule_t *rule) {
mp_parse_node_t pn;
mp_lexer_t *lex = parser->lexer;
if (lex->tok_kind == MP_TOKEN_NAME) {
qstr id = qstr_from_strn(lex->vstr.buf, lex->vstr.len);
#if MICROPY_COMP_CONST
// if name is a standalone identifier, look it up in the table of dynamic constants
mp_map_elem_t *elem;
if (rule->rule_id == RULE_atom
&& (elem = mp_map_lookup(&parser->consts, MP_OBJ_NEW_QSTR(id), MP_MAP_LOOKUP)) != NULL) {
pn = mp_parse_node_new_small_int(MP_OBJ_SMALL_INT_VALUE(elem->value));
} else {
pn = mp_parse_node_new_leaf(MP_PARSE_NODE_ID, id);
}
#else
(void)rule;
pn = mp_parse_node_new_leaf(MP_PARSE_NODE_ID, id);
#endif
} else if (lex->tok_kind == MP_TOKEN_INTEGER) {
mp_obj_t o = mp_parse_num_integer(lex->vstr.buf, lex->vstr.len, 0, lex);
if (MP_OBJ_IS_SMALL_INT(o)) {
pn = mp_parse_node_new_small_int(MP_OBJ_SMALL_INT_VALUE(o));
} else {
pn = make_node_const_object(parser, lex->tok_line, o);
}
} else if (lex->tok_kind == MP_TOKEN_FLOAT_OR_IMAG) {
mp_obj_t o = mp_parse_num_decimal(lex->vstr.buf, lex->vstr.len, true, false, lex);
pn = make_node_const_object(parser, lex->tok_line, o);
} else if (lex->tok_kind == MP_TOKEN_STRING || lex->tok_kind == MP_TOKEN_BYTES) {
// Don't automatically intern all strings/bytes. doc strings (which are usually large)
// will be discarded by the compiler, and so we shouldn't intern them.
qstr qst = MP_QSTR_NULL;
if (lex->vstr.len <= MICROPY_ALLOC_PARSE_INTERN_STRING_LEN) {
// intern short strings
qst = qstr_from_strn(lex->vstr.buf, lex->vstr.len);
} else {
// check if this string is already interned
qst = qstr_find_strn(lex->vstr.buf, lex->vstr.len);
}
if (qst != MP_QSTR_NULL) {
// qstr exists, make a leaf node
pn = mp_parse_node_new_leaf(lex->tok_kind == MP_TOKEN_STRING ? MP_PARSE_NODE_STRING : MP_PARSE_NODE_BYTES, qst);
} else {
// not interned, make a node holding a pointer to the string/bytes data
pn = make_node_string_bytes(parser, lex->tok_line, lex->tok_kind == MP_TOKEN_STRING ? RULE_string : RULE_bytes, lex->vstr.buf, lex->vstr.len);
}
} else {
pn = mp_parse_node_new_leaf(MP_PARSE_NODE_TOKEN, lex->tok_kind);
}
push_result_node(parser, pn);
}
#if MICROPY_COMP_MODULE_CONST
STATIC const mp_rom_map_elem_t mp_constants_table[] = {
#if MICROPY_PY_UERRNO
{ MP_ROM_QSTR(MP_QSTR_errno), MP_ROM_PTR(&mp_module_uerrno) },
#endif
#if MICROPY_PY_UCTYPES
{ MP_ROM_QSTR(MP_QSTR_uctypes), MP_ROM_PTR(&mp_module_uctypes) },
#endif
// Extra constants as defined by a port
MICROPY_PORT_CONSTANTS
};
STATIC MP_DEFINE_CONST_MAP(mp_constants_map, mp_constants_table);
#endif
STATIC void push_result_rule(parser_t *parser, size_t src_line, const rule_t *rule, size_t num_args);
#if MICROPY_COMP_CONST_FOLDING
STATIC bool fold_logical_constants(parser_t *parser, const rule_t *rule, size_t *num_args) {
if (rule->rule_id == RULE_or_test
|| rule->rule_id == RULE_and_test) {
// folding for binary logical ops: or and
size_t copy_to = *num_args;
for (size_t i = copy_to; i > 0;) {
mp_parse_node_t pn = peek_result(parser, --i);
parser->result_stack[parser->result_stack_top - copy_to] = pn;
if (i == 0) {
// always need to keep the last value
break;
}
if (rule->rule_id == RULE_or_test) {
if (mp_parse_node_is_const_true(pn)) {
//
break;
} else if (!mp_parse_node_is_const_false(pn)) {
copy_to -= 1;
}
} else {
// RULE_and_test
if (mp_parse_node_is_const_false(pn)) {
break;
} else if (!mp_parse_node_is_const_true(pn)) {
copy_to -= 1;
}
}
}
copy_to -= 1; // copy_to now contains number of args to pop
// pop and discard all the short-circuited expressions
for (size_t i = 0; i < copy_to; ++i) {
pop_result(parser);
}
*num_args -= copy_to;
// we did a complete folding if there's only 1 arg left
return *num_args == 1;
} else if (rule->rule_id == RULE_not_test_2) {
// folding for unary logical op: not
mp_parse_node_t pn = peek_result(parser, 0);
if (mp_parse_node_is_const_false(pn)) {
pn = mp_parse_node_new_leaf(MP_PARSE_NODE_TOKEN, MP_TOKEN_KW_TRUE);
} else if (mp_parse_node_is_const_true(pn)) {
pn = mp_parse_node_new_leaf(MP_PARSE_NODE_TOKEN, MP_TOKEN_KW_FALSE);
} else {
return false;
}
pop_result(parser);
push_result_node(parser, pn);
return true;
}
return false;
}
STATIC bool fold_constants(parser_t *parser, const rule_t *rule, size_t num_args) {
// this code does folding of arbitrary integer expressions, eg 1 + 2 * 3 + 4
// it does not do partial folding, eg 1 + 2 + x -> 3 + x
mp_obj_t arg0;
if (rule->rule_id == RULE_expr
|| rule->rule_id == RULE_xor_expr
|| rule->rule_id == RULE_and_expr) {
// folding for binary ops: | ^ &
mp_parse_node_t pn = peek_result(parser, num_args - 1);
if (!mp_parse_node_get_int_maybe(pn, &arg0)) {
return false;
}
mp_binary_op_t op;
if (rule->rule_id == RULE_expr) {
op = MP_BINARY_OP_OR;
} else if (rule->rule_id == RULE_xor_expr) {
op = MP_BINARY_OP_XOR;
} else {
op = MP_BINARY_OP_AND;
}
for (ssize_t i = num_args - 2; i >= 0; --i) {
pn = peek_result(parser, i);
mp_obj_t arg1;
if (!mp_parse_node_get_int_maybe(pn, &arg1)) {
return false;
}
arg0 = mp_binary_op(op, arg0, arg1);
}
} else if (rule->rule_id == RULE_shift_expr
|| rule->rule_id == RULE_arith_expr
|| rule->rule_id == RULE_term) {
// folding for binary ops: << >> + - * / % //
mp_parse_node_t pn = peek_result(parser, num_args - 1);
if (!mp_parse_node_get_int_maybe(pn, &arg0)) {
return false;
}
for (ssize_t i = num_args - 2; i >= 1; i -= 2) {
pn = peek_result(parser, i - 1);
mp_obj_t arg1;
if (!mp_parse_node_get_int_maybe(pn, &arg1)) {
return false;
}
mp_token_kind_t tok = MP_PARSE_NODE_LEAF_ARG(peek_result(parser, i));
static const uint8_t token_to_op[] = {
MP_BINARY_OP_ADD,
MP_BINARY_OP_SUBTRACT,
MP_BINARY_OP_MULTIPLY,
255,//MP_BINARY_OP_POWER,
255,//MP_BINARY_OP_TRUE_DIVIDE,
MP_BINARY_OP_FLOOR_DIVIDE,
MP_BINARY_OP_MODULO,
255,//MP_BINARY_OP_LESS
MP_BINARY_OP_LSHIFT,
255,//MP_BINARY_OP_MORE
MP_BINARY_OP_RSHIFT,
};
mp_binary_op_t op = token_to_op[tok - MP_TOKEN_OP_PLUS];
if (op == (mp_binary_op_t)255) {
return false;
}
int rhs_sign = mp_obj_int_sign(arg1);
if (op <= MP_BINARY_OP_RSHIFT) {
// << and >> can't have negative rhs
if (rhs_sign < 0) {
return false;
}
} else if (op >= MP_BINARY_OP_FLOOR_DIVIDE) {
// % and // can't have zero rhs
if (rhs_sign == 0) {
return false;
}
}
arg0 = mp_binary_op(op, arg0, arg1);
}
} else if (rule->rule_id == RULE_factor_2) {
// folding for unary ops: + - ~
mp_parse_node_t pn = peek_result(parser, 0);
if (!mp_parse_node_get_int_maybe(pn, &arg0)) {
return false;
}
mp_token_kind_t tok = MP_PARSE_NODE_LEAF_ARG(peek_result(parser, 1));
mp_unary_op_t op;
if (tok == MP_TOKEN_OP_PLUS) {
op = MP_UNARY_OP_POSITIVE;
} else if (tok == MP_TOKEN_OP_MINUS) {
op = MP_UNARY_OP_NEGATIVE;
} else {
assert(tok == MP_TOKEN_OP_TILDE); // should be
op = MP_UNARY_OP_INVERT;
}
arg0 = mp_unary_op(op, arg0);
#if MICROPY_COMP_CONST
} else if (rule->rule_id == RULE_expr_stmt) {
mp_parse_node_t pn1 = peek_result(parser, 0);
if (!MP_PARSE_NODE_IS_NULL(pn1)
&& !(MP_PARSE_NODE_IS_STRUCT_KIND(pn1, RULE_expr_stmt_augassign)
|| MP_PARSE_NODE_IS_STRUCT_KIND(pn1, RULE_expr_stmt_assign_list))) {
// this node is of the form <x> = <y>
mp_parse_node_t pn0 = peek_result(parser, 1);
if (MP_PARSE_NODE_IS_ID(pn0)
&& MP_PARSE_NODE_IS_STRUCT_KIND(pn1, RULE_atom_expr_normal)
&& MP_PARSE_NODE_IS_ID(((mp_parse_node_struct_t*)pn1)->nodes[0])
&& MP_PARSE_NODE_LEAF_ARG(((mp_parse_node_struct_t*)pn1)->nodes[0]) == MP_QSTR_const
&& MP_PARSE_NODE_IS_STRUCT_KIND(((mp_parse_node_struct_t*)pn1)->nodes[1], RULE_trailer_paren)
) {
// code to assign dynamic constants: id = const(value)
// get the id
qstr id = MP_PARSE_NODE_LEAF_ARG(pn0);
// get the value
mp_parse_node_t pn_value = ((mp_parse_node_struct_t*)((mp_parse_node_struct_t*)pn1)->nodes[1])->nodes[0];
if (!MP_PARSE_NODE_IS_SMALL_INT(pn_value)) {
parser->parse_error = PARSE_ERROR_CONST;
return false;
}
mp_int_t value = MP_PARSE_NODE_LEAF_SMALL_INT(pn_value);
// store the value in the table of dynamic constants
mp_map_elem_t *elem = mp_map_lookup(&parser->consts, MP_OBJ_NEW_QSTR(id), MP_MAP_LOOKUP_ADD_IF_NOT_FOUND);
assert(elem->value == MP_OBJ_NULL);
elem->value = MP_OBJ_NEW_SMALL_INT(value);
// If the constant starts with an underscore then treat it as a private
// variable and don't emit any code to store the value to the id.
if (qstr_str(id)[0] == '_') {
pop_result(parser); // pop const(value)
pop_result(parser); // pop id
push_result_rule(parser, 0, rules[RULE_pass_stmt], 0); // replace with "pass"
return true;
}
// replace const(value) with value
pop_result(parser);
push_result_node(parser, pn_value);
// finished folding this assignment, but we still want it to be part of the tree
return false;
}
}
return false;
#endif
#if MICROPY_COMP_MODULE_CONST
} else if (rule->rule_id == RULE_atom_expr_normal) {
mp_parse_node_t pn0 = peek_result(parser, 1);
mp_parse_node_t pn1 = peek_result(parser, 0);
if (!(MP_PARSE_NODE_IS_ID(pn0)
&& MP_PARSE_NODE_IS_STRUCT_KIND(pn1, RULE_trailer_period))) {
return false;
}
// id1.id2
// look it up in constant table, see if it can be replaced with an integer
mp_parse_node_struct_t *pns1 = (mp_parse_node_struct_t*)pn1;
assert(MP_PARSE_NODE_IS_ID(pns1->nodes[0]));
qstr q_base = MP_PARSE_NODE_LEAF_ARG(pn0);
qstr q_attr = MP_PARSE_NODE_LEAF_ARG(pns1->nodes[0]);
mp_map_elem_t *elem = mp_map_lookup((mp_map_t*)&mp_constants_map, MP_OBJ_NEW_QSTR(q_base), MP_MAP_LOOKUP);
if (elem == NULL) {
return false;
}
mp_obj_t dest[2];
mp_load_method_maybe(elem->value, q_attr, dest);
if (!(dest[0] != MP_OBJ_NULL && MP_OBJ_IS_INT(dest[0]) && dest[1] == MP_OBJ_NULL)) {
return false;
}
arg0 = dest[0];
#endif
} else {
return false;
}
// success folding this rule
for (size_t i = num_args; i > 0; i--) {
pop_result(parser);
}
if (MP_OBJ_IS_SMALL_INT(arg0)) {
push_result_node(parser, mp_parse_node_new_small_int(MP_OBJ_SMALL_INT_VALUE(arg0)));
} else {
// TODO reuse memory for parse node struct?
push_result_node(parser, make_node_const_object(parser, 0, arg0));
}
return true;
}
#endif
STATIC void push_result_rule(parser_t *parser, size_t src_line, const rule_t *rule, size_t num_args) {
// optimise away parenthesis around an expression if possible
if (rule->rule_id == RULE_atom_paren) {
// there should be just 1 arg for this rule
mp_parse_node_t pn = peek_result(parser, 0);
if (MP_PARSE_NODE_IS_NULL(pn)) {
// need to keep parenthesis for ()
} else if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, RULE_testlist_comp)) {
// need to keep parenthesis for (a, b, ...)
} else {
// parenthesis around a single expression, so it's just the expression
return;
}
}
#if MICROPY_COMP_CONST_FOLDING
if (fold_logical_constants(parser, rule, &num_args)) {
// we folded this rule so return straight away
return;
}
if (fold_constants(parser, rule, num_args)) {
// we folded this rule so return straight away
return;
}
#endif
mp_parse_node_struct_t *pn = parser_alloc(parser, sizeof(mp_parse_node_struct_t) + sizeof(mp_parse_node_t) * num_args);
if (pn == NULL) {
parser->parse_error = PARSE_ERROR_MEMORY;
return;
}
pn->source_line = src_line;
pn->kind_num_nodes = (rule->rule_id & 0xff) | (num_args << 8);
for (size_t i = num_args; i > 0; i--) {
pn->nodes[i - 1] = pop_result(parser);
}
push_result_node(parser, (mp_parse_node_t)pn);
}
mp_parse_tree_t mp_parse(mp_lexer_t *lex, mp_parse_input_kind_t input_kind) {
// initialise parser and allocate memory for its stacks
parser_t parser;
parser.parse_error = PARSE_ERROR_NONE;
parser.rule_stack_alloc = MICROPY_ALLOC_PARSE_RULE_INIT;
parser.rule_stack_top = 0;
parser.rule_stack = m_new_maybe(rule_stack_t, parser.rule_stack_alloc);
parser.result_stack_alloc = MICROPY_ALLOC_PARSE_RESULT_INIT;
parser.result_stack_top = 0;
parser.result_stack = m_new_maybe(mp_parse_node_t, parser.result_stack_alloc);
parser.lexer = lex;
parser.tree.chunk = NULL;
parser.cur_chunk = NULL;
#if MICROPY_COMP_CONST
mp_map_init(&parser.consts, 0);
#endif
// check if we could allocate the stacks
if (parser.rule_stack == NULL || parser.result_stack == NULL) {
goto memory_error;
}
// work out the top-level rule to use, and push it on the stack
size_t top_level_rule;
switch (input_kind) {
case MP_PARSE_SINGLE_INPUT: top_level_rule = RULE_single_input; break;
case MP_PARSE_EVAL_INPUT: top_level_rule = RULE_eval_input; break;
default: top_level_rule = RULE_file_input;
}
push_rule(&parser, lex->tok_line, rules[top_level_rule], 0);
// parse!
size_t n, i; // state for the current rule
size_t rule_src_line; // source line for the first token matched by the current rule
bool backtrack = false;
const rule_t *rule = NULL;
for (;;) {
next_rule:
if (parser.rule_stack_top == 0 || parser.parse_error) {
break;
}
pop_rule(&parser, &rule, &i, &rule_src_line);
n = rule->act & RULE_ACT_ARG_MASK;
/*
// debugging
printf("depth=%d ", parser.rule_stack_top);
for (int j = 0; j < parser.rule_stack_top; ++j) {
printf(" ");
}
printf("%s n=%d i=%d bt=%d\n", rule->rule_name, n, i, backtrack);
*/
switch (rule->act & RULE_ACT_KIND_MASK) {
case RULE_ACT_OR:
if (i > 0 && !backtrack) {
goto next_rule;
} else {
backtrack = false;
}
for (; i < n; ++i) {
uint16_t kind = rule->arg[i] & RULE_ARG_KIND_MASK;
if (kind == RULE_ARG_TOK) {
if (lex->tok_kind == (rule->arg[i] & RULE_ARG_ARG_MASK)) {
push_result_token(&parser, rule);
mp_lexer_to_next(lex);
goto next_rule;
}
} else {
assert(kind == RULE_ARG_RULE);
if (i + 1 < n) {
push_rule(&parser, rule_src_line, rule, i + 1); // save this or-rule
}
push_rule_from_arg(&parser, rule->arg[i]); // push child of or-rule
goto next_rule;
}
}
backtrack = true;
break;
case RULE_ACT_AND: {
// failed, backtrack if we can, else syntax error
if (backtrack) {
assert(i > 0);
if ((rule->arg[i - 1] & RULE_ARG_KIND_MASK) == RULE_ARG_OPT_RULE) {
// an optional rule that failed, so continue with next arg
push_result_node(&parser, MP_PARSE_NODE_NULL);
backtrack = false;
} else {
// a mandatory rule that failed, so propagate backtrack
if (i > 1) {
// already eaten tokens so can't backtrack
goto syntax_error;
} else {
goto next_rule;
}
}
}
// progress through the rule
for (; i < n; ++i) {
switch (rule->arg[i] & RULE_ARG_KIND_MASK) {
case RULE_ARG_TOK: {
// need to match a token
mp_token_kind_t tok_kind = rule->arg[i] & RULE_ARG_ARG_MASK;
if (lex->tok_kind == tok_kind) {
// matched token
if (tok_kind == MP_TOKEN_NAME) {
push_result_token(&parser, rule);
}
mp_lexer_to_next(lex);
} else {
// failed to match token
if (i > 0) {
// already eaten tokens so can't backtrack
goto syntax_error;
} else {
// this rule failed, so backtrack
backtrack = true;
goto next_rule;
}
}
break;
}
case RULE_ARG_RULE:
case RULE_ARG_OPT_RULE:
rule_and_no_other_choice:
push_rule(&parser, rule_src_line, rule, i + 1); // save this and-rule
push_rule_from_arg(&parser, rule->arg[i]); // push child of and-rule
goto next_rule;
default:
assert(0);
goto rule_and_no_other_choice; // to help flow control analysis
}
}
assert(i == n);
// matched the rule, so now build the corresponding parse_node
#if !MICROPY_ENABLE_DOC_STRING
// this code discards lonely statements, such as doc strings
if (input_kind != MP_PARSE_SINGLE_INPUT && rule->rule_id == RULE_expr_stmt && peek_result(&parser, 0) == MP_PARSE_NODE_NULL) {
mp_parse_node_t p = peek_result(&parser, 1);
if ((MP_PARSE_NODE_IS_LEAF(p) && !MP_PARSE_NODE_IS_ID(p)) || MP_PARSE_NODE_IS_STRUCT_KIND(p, RULE_string)) {
pop_result(&parser); // MP_PARSE_NODE_NULL
mp_parse_node_t pn = pop_result(&parser); // possibly RULE_string
if (MP_PARSE_NODE_IS_STRUCT(pn)) {
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
if (MP_PARSE_NODE_STRUCT_KIND(pns) == RULE_string) {
m_del(char, (char*)pns->nodes[0], (size_t)pns->nodes[1]);
}
}
push_result_rule(&parser, rule_src_line, rules[RULE_pass_stmt], 0);
break;
}
}
#endif
// count number of arguments for the parse node
i = 0;
size_t num_not_nil = 0;
for (size_t x = n; x > 0;) {
--x;
if ((rule->arg[x] & RULE_ARG_KIND_MASK) == RULE_ARG_TOK) {
mp_token_kind_t tok_kind = rule->arg[x] & RULE_ARG_ARG_MASK;
if (tok_kind == MP_TOKEN_NAME) {
// only tokens which were names are pushed to stack
i += 1;
num_not_nil += 1;
}
} else {
// rules are always pushed
if (peek_result(&parser, i) != MP_PARSE_NODE_NULL) {
num_not_nil += 1;
}
i += 1;
}
}
if (num_not_nil == 1 && (rule->act & RULE_ACT_ALLOW_IDENT)) {
// this rule has only 1 argument and should not be emitted
mp_parse_node_t pn = MP_PARSE_NODE_NULL;
for (size_t x = 0; x < i; ++x) {
mp_parse_node_t pn2 = pop_result(&parser);
if (pn2 != MP_PARSE_NODE_NULL) {
pn = pn2;
}
}
push_result_node(&parser, pn);
} else {
// this rule must be emitted
if (rule->act & RULE_ACT_ADD_BLANK) {
// and add an extra blank node at the end (used by the compiler to store data)
push_result_node(&parser, MP_PARSE_NODE_NULL);
i += 1;
}
push_result_rule(&parser, rule_src_line, rule, i);
}
break;
}
case RULE_ACT_LIST: {
// n=2 is: item item*
// n=1 is: item (sep item)*
// n=3 is: item (sep item)* [sep]
bool had_trailing_sep;
if (backtrack) {
list_backtrack:
had_trailing_sep = false;
if (n == 2) {
if (i == 1) {
// fail on item, first time round; propagate backtrack
goto next_rule;
} else {
// fail on item, in later rounds; finish with this rule
backtrack = false;
}
} else {
if (i == 1) {
// fail on item, first time round; propagate backtrack
goto next_rule;
} else if ((i & 1) == 1) {
// fail on item, in later rounds; have eaten tokens so can't backtrack
if (n == 3) {
// list allows trailing separator; finish parsing list
had_trailing_sep = true;
backtrack = false;
} else {
// list doesn't allowing trailing separator; fail
goto syntax_error;
}
} else {
// fail on separator; finish parsing list
backtrack = false;
}
}
} else {
for (;;) {
size_t arg = rule->arg[i & 1 & n];
switch (arg & RULE_ARG_KIND_MASK) {
case RULE_ARG_TOK:
if (lex->tok_kind == (arg & RULE_ARG_ARG_MASK)) {
if (i & 1 & n) {
// separators which are tokens are not pushed to result stack
} else {
push_result_token(&parser, rule);
}
mp_lexer_to_next(lex);
// got element of list, so continue parsing list
i += 1;
} else {
// couldn't get element of list
i += 1;
backtrack = true;
goto list_backtrack;
}
break;
case RULE_ARG_RULE:
rule_list_no_other_choice:
push_rule(&parser, rule_src_line, rule, i + 1); // save this list-rule
push_rule_from_arg(&parser, arg); // push child of list-rule
goto next_rule;
default:
assert(0);
goto rule_list_no_other_choice; // to help flow control analysis
}
}
}
assert(i >= 1);
// compute number of elements in list, result in i
i -= 1;
if ((n & 1) && (rule->arg[1] & RULE_ARG_KIND_MASK) == RULE_ARG_TOK) {
// don't count separators when they are tokens
i = (i + 1) / 2;
}
if (i == 1) {
// list matched single item
if (had_trailing_sep) {
// if there was a trailing separator, make a list of a single item
push_result_rule(&parser, rule_src_line, rule, i);
} else {
// just leave single item on stack (ie don't wrap in a list)
}
} else {
push_result_rule(&parser, rule_src_line, rule, i);
}
break;
}
default:
assert(0);
}
}
#if MICROPY_COMP_CONST
mp_map_deinit(&parser.consts);
#endif
// truncate final chunk and link into chain of chunks
if (parser.cur_chunk != NULL) {
(void)m_renew_maybe(byte, parser.cur_chunk,
sizeof(mp_parse_chunk_t) + parser.cur_chunk->alloc,
sizeof(mp_parse_chunk_t) + parser.cur_chunk->union_.used,
false);
parser.cur_chunk->alloc = parser.cur_chunk->union_.used;
parser.cur_chunk->union_.next = parser.tree.chunk;
parser.tree.chunk = parser.cur_chunk;
}
mp_obj_t exc;
if (parser.parse_error) {
#if MICROPY_COMP_CONST
if (parser.parse_error == PARSE_ERROR_CONST) {
exc = mp_obj_new_exception_msg(&mp_type_SyntaxError,
"constant must be an integer");
} else
#endif
{
assert(parser.parse_error == PARSE_ERROR_MEMORY);
memory_error:
exc = mp_obj_new_exception_msg(&mp_type_MemoryError,
"parser could not allocate enough memory");
}
parser.tree.root = MP_PARSE_NODE_NULL;
} else if (
lex->tok_kind != MP_TOKEN_END // check we are at the end of the token stream
|| parser.result_stack_top == 0 // check that we got a node (can fail on empty input)
) {
syntax_error:
if (lex->tok_kind == MP_TOKEN_INDENT) {
exc = mp_obj_new_exception_msg(&mp_type_IndentationError,
"unexpected indent");
} else if (lex->tok_kind == MP_TOKEN_DEDENT_MISMATCH) {
exc = mp_obj_new_exception_msg(&mp_type_IndentationError,
"unindent does not match any outer indentation level");
} else {
exc = mp_obj_new_exception_msg(&mp_type_SyntaxError,
"invalid syntax");
}
parser.tree.root = MP_PARSE_NODE_NULL;
} else {
// no errors
//result_stack_show(parser);
//printf("rule stack alloc: %d\n", parser.rule_stack_alloc);
//printf("result stack alloc: %d\n", parser.result_stack_alloc);
//printf("number of parse nodes allocated: %d\n", num_parse_nodes_allocated);
// get the root parse node that we created
assert(parser.result_stack_top == 1);
exc = MP_OBJ_NULL;
parser.tree.root = parser.result_stack[0];
}
// free the memory that we don't need anymore
m_del(rule_stack_t, parser.rule_stack, parser.rule_stack_alloc);
m_del(mp_parse_node_t, parser.result_stack, parser.result_stack_alloc);
// we also free the lexer on behalf of the caller (see below)
if (exc != MP_OBJ_NULL) {
// had an error so raise the exception
// add traceback to give info about file name and location
// we don't have a 'block' name, so just pass the NULL qstr to indicate this
mp_obj_exception_add_traceback(exc, lex->source_name, lex->tok_line, MP_QSTR_NULL);
mp_lexer_free(lex);
nlr_raise(exc);
} else {
mp_lexer_free(lex);
return parser.tree;
}
}
void mp_parse_tree_clear(mp_parse_tree_t *tree) {
mp_parse_chunk_t *chunk = tree->chunk;
while (chunk != NULL) {
mp_parse_chunk_t *next = chunk->union_.next;
m_del(byte, chunk, sizeof(mp_parse_chunk_t) + chunk->alloc);
chunk = next;
}
}
#endif // MICROPY_ENABLE_COMPILER