Package repos come from untrusted sources, from the buildserver's point of
view. They should be handled in VMs and containers as much as possible to
limit exposure to vulnerabilities. As far as I could tell, `fdroid update`
only has a single place where it executes a VCS: if a .fdroid.yml file is
present in a package repo, then it will fetch the commit ID using git.
For better security properties, this implements a simple function that just
reads the files to get that commit ID. The function that executes git to do
the same thing is relabeled "unsafe". The unsafe version is still used for
status JSON everywhere, but there it runs on fdroiddata.git and
fdroidserver.git, which are trusted repos.
The unsafe version is also used in places where git.Repo() is needed for
other things.
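
A minimal sketch of reading the commit ID straight from the files, without executing git, might look like the following. The name `read_head_commit_id` and the validation rules are assumptions for illustration, not the function this commit adds. It assumes `.git` is a plain directory, so worktrees and submodules are not handled.

```python
import re
from pathlib import Path

# A commit ID is 40 hex chars (SHA-1) or 64 hex chars (SHA-256).
_HEX_ID = re.compile(r'[0-9a-f]{40}|[0-9a-f]{64}')


def _valid_id(text):
    """Return the stripped text if it looks like a commit ID, else None."""
    text = text.strip()
    return text if _HEX_ID.fullmatch(text) else None


def read_head_commit_id(repo_dir):
    """Hypothetical helper: get the HEAD commit ID by reading files only."""
    git_dir = Path(repo_dir) / '.git'
    try:
        head = (git_dir / 'HEAD').read_text().strip()
        if not head.startswith('ref: '):
            return _valid_id(head)  # detached HEAD stores the ID directly
        ref = head[len('ref: '):].strip()
        # The repo is untrusted, so refuse refs that could leave .git/refs/.
        if not ref.startswith('refs/') or '..' in ref.split('/'):
            return None
        ref_file = git_dir / ref
        if ref_file.is_file():
            return _valid_id(ref_file.read_text())
        # Refs without a loose file may be listed in packed-refs instead.
        for line in (git_dir / 'packed-refs').read_text().splitlines():
            commit_id, _, name = line.partition(' ')
            if name.strip() == ref:
                return _valid_id(commit_id)
    except (OSError, UnicodeDecodeError):
        pass
    return None
```

Anything that does not look like a hex commit ID is discarded, so unexpected file content in an untrusted repo ends up as `None` rather than being passed along.
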
This is a key piece of the ongoing `PUBLISH` _config.yml_ migration. The code was inconsistent about which YAML parser it used, and that could lead to bugs where one parser reads a value one way and a different parser reads it another way. I wanted to be sure that YAML 1.2 would always work.
This makes all code that handles config files use the same `ruamel.yaml` parsers. This only touches other usages of YAML parsers when there is overlap. This does not port all of _fdroidserver_ to `ruamel.yaml` and YAML 1.2. The metadata files should already be YAML 1.2 anyway.
_builds_to_yaml does not use any features of the metadata.Build class, so
it can operate on plain dicts as well. It also does not need to output
Build instances because those are converted to plain dicts when writing out
to YAML.
The type conversion should all happen in post_parse_yaml_metadata whenever
possible. Also, when `if` blocks end in `return`, it is clearer if no
`elif` or `else` is used.
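
As a small illustration of the early-return style described above, here is a sketch with a hypothetical `as_text` helper. It is not code from metadata.py, only an example of the shape.

```python
def as_text(value):
    """Hypothetical converter written with early returns, no elif/else."""
    if value is None:
        return ''
    if isinstance(value, bytes):
        return value.decode()
    # Every earlier branch returned, so this is the only remaining case.
    return str(value)
```
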
This should reduce surprises when dealing with filenames in things like
`rm:`. So any float/int/bool value can be used directly, without quoting.
* A plain str/int/float value is interpreted as a list of one string.
* A dictionary as a value raises an error.
* A set is treated like a list.
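
The rules above can be sketched roughly as follows. The function names, the choice of `TypeError`, the lowercase rendering of booleans, and the sorting of sets are all assumptions made for this illustration, not the actual implementation.

```python
def _to_string(value):
    """Render one scalar as a string (booleans in lowercase: an assumption)."""
    if isinstance(value, bool):
        return 'true' if value else 'false'
    return str(value)


def to_string_list(value):
    """Hypothetical normalizer for fields that hold a list of strings."""
    if isinstance(value, dict):
        # The exception type is an assumption of this sketch.
        raise TypeError('a dict cannot be used as a list of strings')
    if isinstance(value, (str, bool, int, float)):
        return [_to_string(value)]  # a plain scalar becomes a one-item list
    if isinstance(value, (set, frozenset)):
        # Sets have no order, so sort to keep the output stable.
        return sorted(_to_string(v) for v in value)
    return [_to_string(v) for v in value]
```

With this, an unquoted `rm: 1.0` that the parser reads as a float would still end up as the filename string `'1.0'`.
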
Even people who know what the special floats not-a-number, infinity, and
negative infinity are don't necessarily know the YAML 1.2 syntax for them.
I didn't. And I've spent some quality time fighting things with those
values. They are also easy to reliably convert to string values.
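
For reference, YAML 1.2 spells these values `.nan`, `.inf`, and `-.inf`. A minimal sketch of converting a float to a string along those lines, using a hypothetical helper name:

```python
import math


def float_to_yaml_string(value):
    """Hypothetical helper: render a float using YAML 1.2 spellings."""
    if math.isnan(value):
        return '.nan'
    if math.isinf(value):
        return '.inf' if value > 0 else '-.inf'
    return str(value)
```
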
If the metadata file contains NoSourceSince:, it is added to the collection
of Anti-Features. When rewriting the .yml file, NoSourceSince should only
be written into the AntiFeatures: collection if there are manual changes,
e.g. the user had provided translations.
I profiled this with timeit and a dict with 1000000000 items, and this is
the time difference:
    with_equals: 0.8466835720173549
    with_is:     0.8536969239939936
    with_old:    1.4458542719949037
I also compared using `==` and `is`, and `==` was slightly faster.
I tried to get this to indent the .yaml files properly so yamllint defaults
work with tests/metadata/dump/*.yaml, but it didn't take for some reason:
    yaml.indent(mapping=4, sequence=4, offset=2)
This function is only used in checkupdates, and removing it from the App
class moves the App class one step closer to being a plain dict, which is a
more Pythonic style.
Before this, there were separate post-parse paths for app-fields versus
build-flags. This makes all TYPE_STRING values always go through the same
post-parse code path.
My guess is that this is some kind of vestige of the old code structure,
back when there were both .txt and .yml formats. This makes it a normal
Python function: input as arg, return value is the result.
It turns out that the maven: field was originally declared as a TYPE_STRING,
since it was not assigned a different type in metadata.py's flagtypes.
The code was confused because it was given a default value of `False` rather
than `None`, which the rest of the TYPE_STRING fields have.
This construct in build.py means maven: should always be a string:
    if '@' in build.maven:
        maven_dir = os.path.join(root_dir, build.maven.split('@', 1)[1])
    else:
        maven_dir = root_dir
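
To make the point concrete, here is the same logic as a standalone sketch. The wrapper function `maven_dir_for` and the sample values are made up for illustration. With a string the lookup works, while a `False` default fails as soon as `in` is applied to it.

```python
import os


def maven_dir_for(root_dir, maven):
    """Hypothetical wrapper around the build.py construct quoted above."""
    if '@' in maven:
        return os.path.join(root_dir, maven.split('@', 1)[1])
    return root_dir


def fails_with_bool(root_dir):
    """Show that a boolean maven value cannot go through the same path."""
    try:
        maven_dir_for(root_dir, False)
    except TypeError:
        return True  # "'@' in False" raises TypeError
    return False
```
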
The paths in the config must be strings because they are used in things
like env vars, where they must be strings. Plus lots of other places in the
code assume they are strings. This is the first step to defining the
border of where paths can be pathlib.Path() and where they must be strings.
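
One way to draw that border is to convert path objects to strings at the point where the config is handed over. The following is only a sketch: `stringify_paths` and the key names passed to it are hypothetical, not part of this change.

```python
import os
from pathlib import Path


def stringify_paths(config, path_keys):
    """Hypothetical helper: return a copy of config with Path values as str."""
    result = dict(config)
    for key in path_keys:
        value = result.get(key)
        if isinstance(value, os.PathLike):
            # os.environ only accepts str, so convert before it gets there.
            result[key] = os.fspath(value)
    return result
```

For example, `stringify_paths({'sdk_path': Path('/opt/sdk')}, ['sdk_path'])` yields a dict whose `sdk_path` value can be assigned to an entry in `os.environ` without a `TypeError`.
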