summaryrefslogtreecommitdiff
path: root/lib/libalpm/be_sync.c
AgeCommit message (Collapse)Author
2013-09-04libalpm: introduce a usage level for reposDave Reisner
This defines a level of interest a user has in a repository. These are described by the bitmask flags in the alpm_db_usage_t enum: ALPM_DB_USAGE_SEARCH: repo is valid for searching ALPM_DB_USAGE_INSTALL: repo is valid for installs (e.g. -S pkg) ALPM_DB_USAGE_UPGRADE: repo is valid for sysupgrades ALPM_DB_USAGE_ALL: all of the above are valid Explicitly listing the contents of a repo will always be valid, and the repo will always be refreshed appropriately on sync operations. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Allan McRae <allan@archlinux.org>
2013-07-22be_sync: avoid crashing on files in the root of a DBDave Reisner
If a sync DB is malformed and contains entries in the root of the archive, load_pkg_for_entry will leave the 'filename' variable empty, leading to a crash in the ensuing strcmp() calls which determine the DB fragment being examined. While this isn't a read error, this should be reported to the user so that it can be addressed by the creator of the DB. As seen: https://bbs.archlinux.org/viewtopic.php?pid=1297766 Signed-off-by: Dave Reisner <dreisner@archlinux.org>
2013-07-22Add missing headerAllan McRae
Exposed when building with --without-libcurl Signed-off-by: Allan McRae <allan@archlinux.org>
2013-07-05do not check error from close(2)Dave Reisner
On operating systems we support, the behavior is always such that the kernel will do the right thing as far as invalidating the file descriptor, regardless of the eventual return value. Therefore, potentially looping and calling close multiple times is wrong. At best, we call close again on an invalid FD and throw a spurious EBADF error. At worst, we might close an FD which doesn't belong to us when a multi-threaded application opens its own file descriptor between iterations of the loop. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Allan McRae <allan@archlinux.org>
2013-06-04validate %FILEPATH% when parsing repo dbsSimon Gomizelj
Currently we make no effort to validate the %FILENAME% field in the repo db. This allows for relative paths to be considered valid. A carefully crafted db entry with a malicious relative path, (e.g. `../../../../etc/passwd`) will cause pacman to to overwrite _any_ file on the target's machine. Add the following validation: - doesn't start with '.' - doesn't contain a '/' - won't overflow PATH_MAX Signed-off-by: Simon Gomizelj <simongmzlj@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2013-02-13Fix registering database with non-gpgme buildsAllan McRae
We record whether the default SigLevel is set in order to add upon it for the *FileSigLevel entries. When using the only valid value of "SigLevel = Never" with non-gpgme builds, we need to ignore the ALPM_SIG_PACKAGE_SET flag when determining if we have a valid value for the database SigLevel. Signed-off-by: Allan McRae <allan@archlinux.org>
2013-02-09Skip reading sync db deltas files if UseDelta is unsetAllan McRae
Signed-off-by: Allan McRae <allan@archlinux.org>
2013-01-29Use libarchive compat header for relevant symbolsDave Reisner
Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Allan McRae <allan@archlinux.org>
2013-01-29dload: pass back the effective URL to callers of _alpm_downloadDave Reisner
I suspect that eventually we're going to end up returning a pointer to an allocated struct to describe the download result, but that's for another patch when the need arises... Fixes FS#33508. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Allan McRae <allan@archlinux.org>
2013-01-03Update copyright year for 2013Allan McRae
Signed-off-by: Allan McRae <allan@archlinux.org>
2012-12-14Do not warn about missing database if being downloadedAllan McRae
When a configured repo database is not already downloaded, a warning message such as "warning: database file for 'testing' does not exist" is printed. Disable this warning when the database is scheduled to be downloaded in the transaction. Signed-off-by: Allan McRae <allan@archlinux.org>
2012-07-10update comments referring to alpm_option_get_syncdbsDave Reisner
This function was renamed alpm_get_syncdbs as part of b488f229d. Signed-off-by: Dave Reisner <dreisner@archlinux.org>
2012-06-25repo-add: add checkdepends informationAllan McRae
Similar to the case for makedepends, it is useful to be able to access this information without parsing a PKGBUILD. Signed-off-by: Allan McRae <allan@archlinux.org>
2012-06-25add line length parameter to _alpm_strip_newlineDave Reisner
If known, callers can pass the line size to this function in order to avoid an strlen call. Otherwise, they simply pass 0 and _alpm_strip_newline will do the call instead. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Dan McGee <dan@archlinux.org>
2012-05-20repo-add: add makedepends informationAllan McRae
This is useful for tools that automatically rebuild packages and thus require to generate a build order. These entries are skipped by pacman. Signed-off-by: Allan McRae <allan@archlinux.org> Signed-off-by: Dan McGee <dan@archlinux.org>
2012-04-07Prefix alpm_pkgfrom_t members with ALPM_Allan McRae
Signed-off-by: Allan McRae <allan@archlinux.org> Signed-off-by: Dan McGee <dan@archlinux.org>
2012-03-08Simplify package validation info output for sync packagesAllan McRae
Unify the output for local and sync packages by only printing a list of possible validation types for sync packages. This also has the advantage of not printing the very long sha256 checksum which line wrapped on a standard width terminal. Signed-off-by: Allan McRae <allan@archlinux.org>
2012-03-08Hook new optdepend structures upBenedikt Morbach
No new behaviour introduced, everything should work exactly as before. Dan: refactored to use the single alpm_depend_t structure. Signed-off-by: Benedikt Morbach <benedikt.morbach@googlemail.com> Signed-off-by: Dan McGee <dan@archlinux.org>
2012-02-20Update copyright yearsAllan McRae
Add 2012 to the copyright range for all libalpm and pacman source files. Signed-off-by: Allan McRae <allan@archlinux.org> Signed-off-by: Dan McGee <dan@archlinux.org>
2012-02-13use _alpm_access in a few more places for increased debugging infoFlorian Pritz
pacman -U <pkg> returns a bogus "could not find or read package" if the file is on a fuse file system that doesn't allow root access. Debug output isn't very helpful here either so we should log why the access check failed. The other 2 checks already log something when failing so logging a more specific error won't hurt either. Signed-off-by: Florian Pritz <bluewind@xinu.at> Signed-off-by: Dan McGee <dan@archlinux.org>
2012-01-08Fix segfaults on opening invalid archive filesDan McGee
"invalid" in this case simply means files that may or may not be archives. Discovered via a `pacman -Sc` operation with delta files in the package cache directory, but can be triggered if any file is passed to `pacman -Ql` that isn't an archive, for instance, or if the sync database file is not an archive. Fix it up so we are more careful about calling archive_read_finish() only on archives that are valid and have not already been closed, and teach our archive open function to set the returned archive to NULL if we aren't going to be returning something valid anyway. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-01-07Update sync database size estimation valuesDan McGee
As the comment states, this is more like a dartboard than science. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-01-02pkghash improvements/modificationsDan McGee
This patch changes a variety of small things related to our pkghash implementation with an eye toward performance, especially on native 32-bit systems. * Use `unsigned int` rather than `size_t` for hash sizes. We already return ERANGE for any attempted creation of a hash greater than 1 million elements, so unsigned int is more than large enough for our purposes. Switching to this type allows 32 bit systems to do native math without helper functions from libgcc. * _alpm_pkghash_create() now internally adds extra padding for additional array elements, rather than that being the responsibility of the caller. * #define values are moved into static const values in pkghash.c; a new `stride` value is also extracted (but remains set at 1). * Division and modulus operators are removed from the normal find and add paths if possible. We store the upper limit of the number of elements in the hash so we no longer need to calculate this every element addition. When doing wraparound position calculations, we only apply the modulus operator if the value is greater than the number of buckets. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-01-02Minor optimization on likely sync package lookupDan McGee
We have a name_hash value here, so add a cheap compare of it before falling to the strcmp() call. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-01-02Only compile delta regex onceDan McGee
This reduces the number of regcomp() calls when parsing delta entries in the database from once per entry to once for the entire context handle by storing the compiled regex data on the handle itself. Just as we do with the cURL handle, we initialize it the first time it is needed and free it when releasing the handle. A few other small tweaks to the parsing function also take place, including using the stack to store the transient and short file size string while parsing it. When parsing a sync database with 1378 delta entries, this reduces the time of a `pacman -Sl deltas` operation by 50% from 0.22s to 0.12s. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-12-21include config.h via MakefilesDave Reisner
Ensures that config.h is always ordered correctly (first) in the includes. Also means that new source files get this for free without having to remember to add it. We opt for -imacros over -include as its more portable, and the added constraint by -imacros doesn't bother us for config.h. This also touches the HACKING file to remove the explicit mention of config.h as part of the includes. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Dan McGee <dan@archlinux.org>
2011-11-16Add helper method for creating and opening archive objectDan McGee
This moves the common setup code of about 5 different callers into one method. Error messages will now be common and shared in all places; several paths did not have any messages at all before. In addition, we now pick an ideal block size for the archive read based off the larger value of our default buffer size or the st.st_blksize field. For a filesystem such as NFS, this is often much larger than the default 8192- values such as 32768 and 131072 are common. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-11-01Convert package and database archive reads to use file descriptorsDan McGee
This gives us a bit more control and over the archive reading process, and a bit less is done behind the scenes. It also allows us to use fstat() in preference to stat(), which should avoid some potential race conditions. Some reorganization is necessary to move the stat calls after the open() calls. Error handling and cleanup in general is also improved, as we had several potential memory and file handle leaks before in some error paths. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-10-27Introduce ALPM_BUFFER_SIZE constantDan McGee
This takes the place of three previously used constants: ARCHIVE_DEFAULT_BYTES_PER_BLOCK, BUFFER_SIZE, and CPBUFSIZE. In libarchive 3.0, the first constant will be no more, so we can ensure we are forward-compatible by removing our usage of it now. The rest are unified for consistency. By default, we will use the value of BUFSIZ provided by <stdio.h>, which is 8192 on Linux. If that is undefined, a default value is provided. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-09-28Refactor download payload reset and freeDan McGee
This was done to squash a memory leak in the sync database download code. When we downloaded a database and then reused the payload struct, we could find ourselves calling get_fullpath() for the signatures and overwriting non-freed values we had left over from the database download. Refactor the payload_free function into a payload_reset function that we can call that does NOT free the payload itself, so we can reuse payload structs. This also allows us to move the payload to the stack in some call paths, relieving us of the need to alloc space. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-09-27be_sync: fetch only entry mode, not full stat structDan McGee
This saves a lot of unnecessary work since we don't need any of the other fields in the stat struct. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-09-22Ensure database validity flags are set in invalid signature caseDan McGee
We returned the right error code but never set the flags accordingly. Also, now that we can bail early, ensure we set the error code. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-09-20signing: add a process and retry loop for database signaturesDan McGee
In reality, there is no retrying that happens as of now because we don't have any import or changing of the keyring going on, but the code is set up so we can drop this in our new _alpm_process_siglist() function. Wire up the basics to the sync database validation code, so we see something like the following: $ pacman -Ss unknowntrust error: core: signature from "Dan McGee <dpmcgee@gmail.com>" is unknown trust error: core: signature from "Dan McGee <dpmcgee@gmail.com>" is unknown trust error: database 'core' is not valid (invalid or corrupted database (PGP signature)) $ pacman -Ss missingsig error: core: missing required signature error: core: missing required signature error: database 'core' is not valid (invalid or corrupted database (PGP signature)) Yes, there is some double output, but this should be fixable in the future. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-09-20Allow our PGP helper method to pass back the signature resultsDan McGee
This will make its way up the call chain eventually to allow trusting and importing of keys as necessary. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-09-02Clean up handling of size fieldsDan McGee
We currently have csize, isize, and size concepts, and sometimes the difference isn't clear. Ensure the following holds: * size (aka csize): always the compressed size of the package; available for everything except local packages (where it will return 0) * isize: always the installed size of the package; available for all three package types Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-29Parse > 2GiB file sizes correctlyDan McGee
We were using atol(), which on 32 bit, cannot handle values greater than 2GiB, which is fail. Switch to a strtoull() wrapper function tailored toward parsing off_t values. This allows parsing of very large positive integer values. off_t is a signed type, but in our usages, we never parse or have a need for negative values, so the function will return -1 on error. Before: $ pacman -Si flightgear-data | grep Size Download Size : 2097152.00 K Installed Size : 2097152.00 K After: $ ./src/pacman/pacman -Si flightgear-data | grep Size Download Size : 2312592.52 KiB Installed Size : 5402896.00 KiB Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-28Database read optimizationsDan McGee
Hard to believe there was still more room to improve on this, but I found an easily correctable oversight tonight. Our databases (both sync and local) contain many blank lines, and we were not moving onto the next line right away in these cases; instead we would proceed through our strcmp() conditional checks as normal. Some local numbers follow to show the effects of this patch: Sync `-Ss foobarbaz`: 71,709 blank lines skipped early ~1,505,889 strcmp() calls avoided (21 per line) ~15% speed improvement (.210 --> .179 sec) Local `-Qs foobarbaz`: 6,823 blank lines skipped early 115,991 strcmp() calls avoided (17 per line) ~6% speed improvement (.080 -> .071 sec) Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-28be_sync.c: Fix memory leak in alpm_db_update()Lukas Fleischer
Free "syncpath" and restore umask if we fail to grab a lock. Signed-off-by: Lukas Fleischer <archlinux@cryptocrack.de> Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-25Remove argument from check_pgp_helperDan McGee
This one wasn't all that necessary as we only used it in one place in the function, which can be checked easily enough at the call site. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-19Remove alpm_option_get_* usage from backendDan McGee
These are all available directly on the handle without indirection. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-19dload: add 'unlink_on_fail' to payload structDave Reisner
Let callers of _alpm_download state whether we should delete on fail, rather than inferring it from context. We still override this decision and always unlink when a temp file is used. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-18Enhance and utilize database status flagsDan McGee
* Move is_local standalone field to status enum * Create VALID/INVALID flag pair * Create EXISTS/MISSING flag pair With these additional fields, we can be more intelligent with database loading and messages to the user. We now only warn once if a sync database does not exist and do not continue to try to load it once we have marked it as missing. The reason for the flags existing in pairs is so the unknown case can be represented. There should never be a time when both flags in the same group are true, but if they are both false, it represents the unknown case. Care is taken to always manipulate both flags at the same time. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-15Only check necessary signatures and checksumsDan McGee
The precedence goes as follows: signature > sha256sum > md5sum Add some logic and helper methods to check what we have available when loading a package, and then only check what is necessary to verify the package. This should speed up sync database verifies as we no longer will be doing both a checksum and a signature validation. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-15Parse conflicts/provides/replaces at database load timeDan McGee
We did this with depends way back in commit c244cfecf654d3 in 2007. We can do it with these fields as well. Of note is the inclusion of provides even though only '=' is supported- we'll parse other things, but no guarantees are given as to behavior, which is more or less similar to before since we only looked for the equals sign. Also of note is the non-inclusion of optdepends; this will likely be resolved down the road. The biggest benefactors of this change will be the resolving code that formerly had to parse and reparse several of these fields; it only happens once now at load time. This does lead to the disadvantage that we will now always be parsing this information up front even if we never need it in the split form, but as these are uncommon fields and our parser is quite efficient it shouldn't be a big concern. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-15Load and allow access to sha256sumDan McGee
This adds a field in the package struct for this checksum type as well as allowing access via the API to it. The frontend is now able to display any read value. Note that this does not implement any use or verification of the value internally. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-08-02Don't trim whitespace when reading database entriesDan McGee
We don't write with extra or unknown whitespace, so there is little reason for us to trim it when reading either. This also fixes the hopefully never encountered "paths that start or end with spaces" issue, for which two pactests have been added. The tests also contain other evil characters that we have encountered before and handle just fine, but it doesn't hurt to ensure we don't break such support in the future. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-07-19Clean up my debug logger messDan McGee
Signed-off-by: Dan McGee <dan@archlinux.org>
2011-07-18Fix test suite when GPGME is disabledDan McGee
As noted by Allan, we failed pretty hard if gpgme was compiled out. With these changes, only sign001.py fails. This can/will be fixed later once we beef up the test suite with more signing tests anyway. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-07-05absorb some _alpm_download params into payload structDave Reisner
Restore some sanity to the number of arguments passed to _alpm_download and curl_download_internal. Signed-off-by: Dave Reisner <dreisner@archlinux.org>
2011-07-05lib/dload: prevent large file attacksDave Reisner
This means creating a new struct which can pass more descriptive data from the back end sync functions to the downloader. In particular, we're interested in the download size read from the sync DB. When the remote server reports a size larger than this (via a content-length header), abort the transfer. In cases where the size is unknown, we set a hard upper limit of: * 25MiB for a sync DB * 16KiB for a signature For reference, 25MiB is more than twice the size of all of the current binary repos (with files) combined, and 16KiB is a truly gargantuan signature. Signed-off-by: Dave Reisner <dreisner@archlinux.org>