Since we recently replaced the OSError string messages with simple error
codes, the uerrno module restores some user-friendly error messages. The
total code size (after removing the strings and adding the uerrno module)
is still decreased.
It's configured by MICROPY_PY_UERRNO_ERRORCODE and enabled by default
(since that's the behaviour before this patch).
Without this dict the lookup of errno codes to strings must use the
uerrno module itself.
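
A minimal sketch of the code-to-name lookup that such a dict allows. It is
written against CPython's errno module so that it runs anywhere; on uPy the
module would be uerrno, and the dict is only present when
MICROPY_PY_UERRNO_ERRORCODE is enabled. The helper name describe_oserror is
hypothetical.

```python
import errno

def describe_oserror(exc):
    """Return the symbolic name for an OSError's numeric code, if known."""
    # args[0] holds the numeric code when the error was raised with a bare code.
    code = exc.args[0]
    # errorcode maps numeric codes to names such as 'ENOENT'; fall back to
    # the number itself when the code is not in the dict.
    return errno.errorcode.get(code, str(code))

try:
    raise OSError(errno.ENOENT)
except OSError as exc:
    message = describe_oserror(exc)
```

Without the dict, the same mapping has to be done by comparing the code
against the constants in the module one by one.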
ftp.c is the only user of this function so making it static in that file
allows it to be inlined. Also, reusing unichar_toupper means we no longer
depend on the C stdlib for toupper, saving about 300 bytes of code space.
This patch introduces a small framework to track differences between
uPy and CPython. The framework consists of:
- A set of "tests" which test for an individual feature that differs between
uPy and CPy. Each test is like a normal uPy test in the test suite, but
has a special comment at the start with some meta-data: a category (eg
syntax, core language), a human-readable description of the difference, a
cause, and a workaround. Following the meta-data there is a short code
snippet which demonstrates the difference. See tests/cpydiff directory
for the initial set of tests.
- A program (this patch) which runs all the tests (on uPy and CPy) and
generates nicely-formatted .rst documenting the differences.
- Integration into the docs build so that everything is automatic, and the
differences appear in a way that is easy for users to read/reference (see
later commits).
The idea with using this new framework is:
- When a new difference is found it's easy to write a short test for it,
along with a description, and add it to the existing ones. It's also easy
for contributors to submit tests for differences they find.
- When something is no longer different the tool will give an error and
the difference can be removed (or promoted to a proper feature test).
These tests are intended to fail, as they provide a programmatic record of
differences between uPy and CPython. They also contain a special comment
at the start of the file which has meta-data describing the difference,
including known causes and known workarounds.
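
As a rough illustration of how a tool could read such a meta-data comment,
here is a sketch of a header parser. The header layout (a leading
triple-quoted block of "key: value" lines), the field names and the example
content are all assumptions made for this sketch; see the tests themselves
for the real format.

```python
def parse_header(source):
    """Extract 'key: value' meta-data from a leading triple-quoted comment."""
    meta = {}
    lines = source.splitlines()
    # The assumed format requires the file to open with a line of three quotes.
    if not lines or lines[0].strip() != '"""':
        return meta
    for line in lines[1:]:
        if line.strip() == '"""':
            break  # end of the meta-data block, the code snippet follows
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta

# Hypothetical test file: placeholder meta-data followed by a short snippet.
EXAMPLE = '''"""
category: core language
description: hypothetical example difference
cause: unknown
workaround: unknown
"""
print("snippet demonstrating the difference")
'''

meta = parse_header(EXAMPLE)
```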
Since VS2015 update 2, .db files are used for storing browsing info
instead of .sdf files. If users don't specify a location for these files
explicitly they end up in the project directory, so ignore them.
It's much more efficient in RAM and code size to do implicit literal string
concatenation in the lexer, as opposed to the compiler.
RAM usage is reduced because the concatenation can be done right away in the
tokeniser by just accumulating the string/bytes literals into the lexer's
vstr. Prior to this patch adjacent strings/bytes would create a parse tree
(one node per string/bytes) and then in the compiler a whole new chunk of
memory was allocated to store the concatenated string, which used more than
double the memory compared to just accumulating in the lexer.
This patch also significantly reduces code size:
bare-arm: -204
minimal: -204
unix x64: -328
stmhal: -208
esp8266: -284
cc3200: -224
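
The accumulation idea can be sketched as follows. This is only a model of the
technique, not the lexer code: it works on a list of already-split
(kind, text) tokens, ignores bytes literals, and the token kind names are
made up for the example.

```python
def concat_adjacent_strings(tokens):
    """Merge each run of adjacent STRING tokens into a single STRING token."""
    out = []
    buf = None  # text accumulated for the current run (plays the role of vstr)
    for kind, text in tokens:
        if kind == "STRING":
            # Keep appending to the same buffer; no per-literal node is made.
            buf = text if buf is None else buf + text
            continue
        if buf is not None:
            out.append(("STRING", buf))
            buf = None
        out.append((kind, text))
    if buf is not None:
        out.append(("STRING", buf))
    return out

merged = concat_adjacent_strings(
    [("NAME", "x"), ("OP", "="), ("STRING", "ab"), ("STRING", "cd"), ("NEWLINE", "")]
)
```

The parser then only ever sees one string token per run, so no extra parse
nodes or second copy of the data are needed.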
Previous to this patch there was an explicit check for errors with line
continuation (where backslash was not immediately followed by a newline).
But this check is not necessary: if there is an error then the remaining
logic of the tokeniser will reject the backslash and correctly produce a
syntax error.
Since the table of keywords is sorted, we can use strcmp to do the search
and stop part way through the search if the comparison is less-than.
Because all tokens that are names are subject to this search, this
optimisation will improve the overall speed of the lexer when processing
a script.
The change also decreases code size by a little bit because we now use
strcmp instead of the custom str_strn_equal function.
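
A small model of the early-exit search, assuming a sorted keyword table. The
table below is a shortened, illustrative list, and Python's string comparison
stands in for strcmp (the two agree for ASCII names).

```python
# Illustrative subset of keywords; the table must be kept sorted.
KEYWORDS = sorted(["and", "def", "else", "for", "if", "in", "not", "or", "while"])

def find_keyword(name):
    """Return the index of name in KEYWORDS, or -1 if it is not a keyword."""
    for index, keyword in enumerate(KEYWORDS):
        if name == keyword:
            return index
        if name < keyword:
            # Every later entry sorts after this one, so none can match.
            return -1
    return -1
```

A name such as "apple" is rejected after two comparisons instead of a scan of
the whole table.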
Keywords only need to be searched for if the token is a MP_TOKEN_NAME, so
we can move the search to the part of the code that does the tokenising for
MP_TOKEN_NAME.
Grammar rules have 2 variants: ones that are attached to a specific
compile function which is called to compile that grammar node, and ones
that don't have a compile function and are instead just inspected to see
what form they take.
In the compiler there is a table of all grammar rules, with each entry
having a pointer to the associated compile function. Those rules with no
compile function have a null pointer. There are 120 such rules, so that's
120 words of essentially wasted code space.
By grouping together the compile vs no-compile rules we can put all the
no-compile rules at the end of the list of rules, and then we don't need
to store the null pointers. We just have a truncated table and it's
guaranteed that when indexing this table we only index the first half,
the half with populated pointers.
This patch implements such a grouping by having a specific macro for the
compile vs no-compile grammar rules (DEF_RULE vs DEF_RULE_NC). It saves
around 460 bytes of code on 32-bit archs.
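
The layout can be modelled like this. The rule names and compile functions
are hypothetical; the point is only that rules with a compile function get
the lowest ids, so the function table can stop where the no-compile rules
begin.

```python
def compile_rule_a(node):
    return "a:" + node

def compile_rule_b(node):
    return "b:" + node

# Rule ids are list positions: compile rules first, no-compile rules last.
RULES = ["rule_a", "rule_b", "rule_nc_1", "rule_nc_2"]

# Truncated table: one entry per compile rule and no placeholder entries.
COMPILE_FUNCS = [compile_rule_a, compile_rule_b]

def compile_node(rule_id, node):
    """Dispatch to the compile function of a rule that has one."""
    if rule_id >= len(COMPILE_FUNCS):
        raise ValueError("rule has no compile function")
    return COMPILE_FUNCS[rule_id](node)
```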