Conversation
Motivation: While only tests, these files use an obsolete property name `pool.hotfile.monitoring.enable`, and could cause/perpetuate confusion and errors in user configuration. Modification: `pool.hotfile.monitoring.enable` changed to `pool.hotfile.replication.enable` in all cases. Result: Property name use in tests now comports with user applications. Target: master Request: 11.2 Patch: https://rb.dcache.org/r/14654/diff/raw Closes: Requires-notes: yes Requires-book: no Acked-by: Anastasiia Chub
There was a problem hiding this comment.
Pull request overview
This PR is a release-branch rollup for the 11.2 line, combining multiple operational fixes and feature updates across pool tooling, SciTags/Firefly marker handling, hotfile replication pool selection, WebDAV/TPC metadata propagation, and dependency/version bumps.
Changes:
- Improve SciTags/Firefly marker generation, logging, and delivery (including optional additional collector and HTTPS/remote HTTP support).
- Enhance hotfile replication to select pools via PoolManager query (respecting highest read-preference level) and pass protocol/net-unit context.
- Add/extend operational tooling and config support (fio-based pool benchmark command, pool size percentage handling, Jackson JDK8 Optional support), plus dependency and version updates.
Reviewed changes
Copilot reviewed 99 out of 100 changed files in this pull request and generated 14 comments.
Show a summary per file
| File | Description |
|---|---|
| skel/share/lib/utils.sh | Add total-space helper for pool sizing. |
| skel/share/lib/pool.sh | Fix pool size rendering when configured as a percentage. |
| skel/share/lib/pool-benchmark.fio | New fio workload profile for pool benchmarking. |
| skel/share/defaults/pool.properties | Clarify Firefly destination/collector configuration text. |
| skel/man/man8/dcache.8 | Document new dcache pool benchmark command. |
| skel/bin/dcache | Add pool benchmark command that runs fio against pool filesystems. |
| pom.xml | Bump versions; add Jackson jdk8 datatype dependency management. |
| plugins/pom.xml | Bump parent version. |
| plugins/hsqldb/pom.xml | Bump parent version. |
| packages/tar/pom.xml | Bump parent version. |
| packages/system-test/pom.xml | Bump parent version. |
| packages/pom.xml | Bump parent version. |
| packages/fhs/pom.xml | Bump parent version. |
| modules/srm-server/pom.xml | Bump parent version. |
| modules/srm-common/pom.xml | Bump parent version. |
| modules/srm-client/pom.xml | Bump parent version. |
| modules/missingfiles-semsg/pom.xml | Bump parent version. |
| modules/logback-test-config/pom.xml | Bump module version. |
| modules/logback-console-config/pom.xml | Bump parent version. |
| modules/gplazma2/pom.xml | Bump parent version. |
| modules/gplazma2-voms/pom.xml | Bump parent version. |
| modules/gplazma2-scitoken/pom.xml | Bump parent version. |
| modules/gplazma2-roles/pom.xml | Bump parent version. |
| modules/gplazma2-pyscript/pom.xml | Bump parent version. |
| modules/gplazma2-omnisession/pom.xml | Bump parent version. |
| modules/gplazma2-oidc/pom.xml | Bump parent version. |
| modules/gplazma2-oidc-te/pom.xml | Bump parent version. |
| modules/gplazma2-nsswitch/pom.xml | Bump parent version. |
| modules/gplazma2-nis/pom.xml | Bump parent version. |
| modules/gplazma2-multimap/pom.xml | Bump parent version. |
| modules/gplazma2-ldap/pom.xml | Bump parent version. |
| modules/gplazma2-krb5/pom.xml | Bump parent version. |
| modules/gplazma2-kpwd/pom.xml | Bump parent version. |
| modules/gplazma2-jaas/pom.xml | Bump parent version. |
| modules/gplazma2-htpasswd/pom.xml | Bump parent version. |
| modules/gplazma2-grid/pom.xml | Bump parent version. |
| modules/gplazma2-fermi/pom.xml | Bump parent version. |
| modules/gplazma2-banfile/pom.xml | Bump parent version. |
| modules/gplazma2-alise/pom.xml | Bump parent version. |
| modules/ftp-client/pom.xml | Bump parent version. |
| modules/dcache/src/test/resources/org/dcache/pool/classic/hotfile-monitoring-test.xml | Update enable-flag property name used by Spring test context. |
| modules/dcache/src/test/java/org/dcache/pool/movers/TransferLifeCycleTest.java | Add regression tests for SciTag range/FQAN fallback/excludes and lifecycle timestamps. |
| modules/dcache/src/test/java/org/dcache/pool/migration/PoolListByPoolMgrQueryTest.java | Add unit tests for PoolMgrQuery-based pool selection behavior. |
| modules/dcache/src/test/java/org/dcache/pool/migration/MigrationModuleTest.java | Extend tests for protocol/net-unit derivation passed to PoolManager query. |
| modules/dcache/src/test/java/org/dcache/pool/classic/MoverRequestSchedulerTest.java | Add tests ensuring P2P requests are excluded from request counting. |
| modules/dcache/src/test/java/org/dcache/pool/classic/IoQueueManagerTest.java | Add tests for request-count aggregation across queues excluding P2P. |
| modules/dcache/src/test/java/org/dcache/pool/classic/HotfileMonitoringTest.java | Update property names to match new enable flag. |
| modules/dcache/src/test/java/org/dcache/pool/classic/AbstractMoverProtocolTransferServiceTest.java | Extend setup to provide protocol info and file attributes for new logging paths. |
| modules/dcache/src/main/resources/org/dcache/pool/classic/pool.xml | Wire TransferLifeCycle into remote HTTP transfer service. |
| modules/dcache/src/main/java/org/dcache/pool/movers/TransferLifeCycle.java | Extend marker logic (protocol coverage, excludes semantics, additional collector, richer logging, correct timing). |
| modules/dcache/src/main/java/org/dcache/pool/migration/PoolListByPoolMgrQuery.java | New RefreshablePoolList querying PoolManager and selecting highest-preference pools only. |
| modules/dcache/src/main/java/org/dcache/pool/migration/MigrationModule.java | Use PoolMgrQuery-based selection for hotfile replication and include protocol/net-unit context. |
| modules/dcache/src/main/java/org/dcache/pool/classic/PoolV4.java | Pass ProtocolInfo into hotfile request monitoring hook. |
| modules/dcache/src/main/java/org/dcache/pool/classic/MoverRequestScheduler.java | Exclude P2P movers from request-count calculation. |
| modules/dcache/src/main/java/org/dcache/pool/classic/FileRequestMonitor.java | Extend interface to accept ProtocolInfo for better selection context. |
| modules/dcache/src/main/java/org/dcache/pool/classic/DefaultPostTransferService.java | Harden/extend SciTags lifecycle end handling and logging; derive local endpoint when missing. |
| modules/dcache/src/main/java/org/dcache/pool/classic/AbstractMoverProtocolTransferService.java | Add SciTags debug logging on abort paths; add TransferLifeCycle setter. |
| modules/dcache/pom.xml | Bump parent version. |
| modules/dcache-xrootd/src/main/java/org/dcache/xrootd/door/XrootdTransfer.java | Add SciTags request logging and transferTag debug logging. |
| modules/dcache-xrootd/src/main/java/org/dcache/xrootd/door/XrootdDoor.java | Invoke SciTags request logging during transfer creation. |
| modules/dcache-xrootd/pom.xml | Bump parent version. |
| modules/dcache-webdav/src/test/java/org/dcache/webdav/DcacheResourceFactoryTest.java | Add tests for case-insensitive SciTag header lookup. |
| modules/dcache-webdav/src/main/java/org/dcache/webdav/transfer/RemoteTransferHandler.java | Propagate transferTag and ArchiveMetadata; set transferTag on remote HTTP(S) protocol info. |
| modules/dcache-webdav/src/main/java/org/dcache/webdav/transfer/CopyFilter.java | Capture SciTag headers and choose transferTag for pool/TPC. |
| modules/dcache-webdav/src/main/java/org/dcache/webdav/LoggingHandler.java | Add SciTag fields to WebDAV request logging. |
| modules/dcache-webdav/src/main/java/org/dcache/webdav/DcacheResourceFactory.java | Add case-insensitive SciTag header lookup; log request source; propagate ArchiveMetadata for custodial uploads. |
| modules/dcache-webdav/pom.xml | Bump parent version. |
| modules/dcache-vehicles/src/main/java/diskCacheV111/vehicles/RemoteHttpDataTransferProtocolInfo.java | Add transferTag field with backward-compatible deserialization. |
| modules/dcache-vehicles/pom.xml | Bump parent version. |
| modules/dcache-srm/pom.xml | Bump parent version. |
| modules/dcache-spacemanager/src/main/java/diskCacheV111/services/space/SpaceManagerService.java | Ensure access latency/retention policy are always set (override with warnings). |
| modules/dcache-spacemanager/pom.xml | Bump parent version. |
| modules/dcache-resilience/pom.xml | Bump parent version. |
| modules/dcache-qos/pom.xml | Bump parent version. |
| modules/dcache-nfs/pom.xml | Bump parent version. |
| modules/dcache-nearline-spi/pom.xml | Bump parent version. |
| modules/dcache-info/pom.xml | Bump parent version. |
| modules/dcache-history/pom.xml | Bump parent version. |
| modules/dcache-gplazma/pom.xml | Bump parent version. |
| modules/dcache-ftp/pom.xml | Bump parent version. |
| modules/dcache-frontend/src/main/java/org/dcache/restful/providers/ObjectMapperProvider.java | Register Jackson Jdk8Module to support Optional types. |
| modules/dcache-frontend/pom.xml | Add jackson-datatype-jdk8 dependency. |
| modules/dcache-dcap/pom.xml | Bump parent version. |
| modules/dcache-chimera/src/main/java/org/dcache/chimera/namespace/ChimeraHsmStorageInfoExtractor.java | Ensure defaults are returned (no null) for access latency/retention policy. |
| modules/dcache-chimera/pom.xml | Bump parent version. |
| modules/dcache-bulk/pom.xml | Bump parent version. |
| modules/common/pom.xml | Bump parent version. |
| modules/common-security/pom.xml | Bump parent version. |
| modules/common-cli/pom.xml | Bump parent version. |
| modules/chimera/pom.xml | Bump parent version. |
| modules/cells/pom.xml | Bump parent version. |
| modules/benchmarks/pom.xml | Bump parent version. |
| modules/acl/pom.xml | Bump parent version. |
| modules/acl-vehicles/pom.xml | Bump parent version. |
| docs/UserGuide/pom.xml | Bump documentation module version. |
| docs/TheBook/pom.xml | Bump documentation module version. |
| docs/pom.xml | Bump documentation parent version. |
| docs/dev/hotfile-replication.md | Add design/ops documentation for hotfile replication and pool selection. |
| archetypes/pom.xml | Bump parent version. |
| archetypes/dcache-nearline-plugin-archetype/pom.xml | Bump parent version. |
| # Extracts the amount of free space in GiB. | ||
| getTotalSpace() # in $1 = path | ||
| { | ||
| [ -d "$1" ] && ( df -k "${1}" | awk 'NR == 2 { if (NF < 4) { getline; x = $1 } else { x = $2 }; printf "%d", x / (1024 * 1024)}' ) | ||
| } |
There was a problem hiding this comment.
The new getTotalSpace helper returns total filesystem size, but the preceding comment still says it extracts free space. Please update the comment to avoid confusion (getFreeSpace already exists).
| *%) | ||
| total=$(getTotalSpace "$path") | ||
| percent=$(echo "${size}" | tr -d "%") | ||
| size=$((${total}*${percent}/100)) | ||
| stringToGiB "${size}G" size | ||
| echo "${size}G" | ||
| ;; |
There was a problem hiding this comment.
This new % case uses tab indentation while the surrounding file uses spaces. Please align indentation with the existing style to keep diffs readable and avoid formatting issues in shellcheck/editors.
| *%) | ||
| total=$(getTotalSpace "$path") | ||
| percent=$(echo "${size}" | tr -d "%") | ||
| size=$((${total}*${percent}/100)) | ||
| stringToGiB "${size}G" size | ||
| echo "${size}G" |
There was a problem hiding this comment.
In the new percentage branch, total and percent are not declared local, so they become global shell variables and can accidentally affect other functions in the script. Please declare them local (and consider keeping all arithmetic variables scoped within getSizeOfPool).
| [concurrent_randread] | ||
| description=Concurrent Random READ | ||
| rw=randread | ||
| rw=read | ||
| ; check checksum during read |
There was a problem hiding this comment.
In the concurrent_randread job, rw is specified twice (rw=randread and then rw=read), so the second value overrides the first and the job won’t run random reads as intended. Please remove the duplicate and keep the correct rw setting.
| ; NOTE: ALL READ JOBS DELETE TEST FILE AFTER BENCHMARK | ||
| ; | ||
|
|
||
| [global] | ||
| filename_format=fio-test-$jobnum-$filenum | ||
| ; file size | ||
| size=10g | ||
| ioengine=libaio | ||
| ; do not pre-create files | ||
| create_on_open=1 | ||
| ; checksum type | ||
| verify=sha1 | ||
| ; fail on first checksum error | ||
| verify_fatal=1 | ||
| ; invalidate page cache before running jobs | ||
| invalidate=1 | ||
|
|
||
| [seqwrite] | ||
| description=Streaming WRITE | ||
| rw=write | ||
| ; do not check sum during write | ||
| do_verify=0 | ||
| ; block until job is done before jumping to the next section | ||
| stonewall | ||
|
|
||
| [seqread] | ||
| description=Streaming READ | ||
| rw=read | ||
| ; check checksum during read | ||
| do_verify=1 | ||
| ; block until job is done before jumping to the next section | ||
| stonewall |
There was a problem hiding this comment.
The header comment says all read jobs delete the test file, but the seqread job does not set unlink=1 (only randread does). Either update the comment or add unlink=1 to seqread to match the documented behavior.
| private PostTransferService _postTransferService; | ||
| private TransferLifeCycle _transferLifeCycle; | ||
|
|
||
|
|
||
| @Required | ||
| public void setPostTransferService(PostTransferService postTransferService) { | ||
| _postTransferService = postTransferService; | ||
| } | ||
|
|
||
| public void setTransferLifeCycle(TransferLifeCycle transferLifeCycle) { | ||
| _transferLifeCycle = transferLifeCycle; | ||
| } | ||
|
|
There was a problem hiding this comment.
_transferLifeCycle is added with a setter but is not used anywhere in this class. If it’s required for wiring, please integrate it (e.g., invoke lifecycle hooks) or remove it to avoid unused-field warnings and potential Checkstyle/SpotBugs failures.
| /** optional hits to tape system how to store the file */ | ||
| private final Optional<String> _archiveMetadata; |
There was a problem hiding this comment.
Typo in comment: “optional hits to tape system” should be “optional hints to tape system”.
| <dependency> | ||
| <groupId>com.fasterxml.jackson.datatype</groupId> | ||
| <artifactId>jackson-datatype-jdk8</artifactId> | ||
| <version>${version.jackson}</version> | ||
| </dependency> | ||
|
|
There was a problem hiding this comment.
The newly added jackson-datatype-jdk8 dependency block is indented with tabs and doesn’t match the surrounding XML formatting. Please reformat it to be consistent with the rest of the POM (spaces/alignment).
| getExperimentId = TransferLifeCycle.class.getDeclaredMethod( | ||
| "getExperimentId", ProtocolInfo.class, Subject.class); | ||
| getExperimentId.setAccessible(true); | ||
| } |
There was a problem hiding this comment.
This test uses reflection to access the private getExperimentId method. In this codebase, tests often prefer exposing such helpers with @VisibleForTesting (package-private) rather than reflective access, which is brittle to refactors/renames and can be blocked by stricter JVM access rules.
| .TP | ||
| .B pool benchmark [fio options ] [ -- directory] | ||
| Run filesystem benchmarks on the file system containing the pool's \fBdata directory. | ||
| The benchmark uses the fio tool. The options passed to the command are passed directly to fio. | ||
| If a path is specified, the benchmark is run on specified directory only. Otherwise, | ||
| the benchmark is run on all file systems for all configured pools. |
There was a problem hiding this comment.
The manpage synopsis for pool benchmark doesn’t exactly match the CLI usage string in skel/bin/dcache (extra space in [fio options ] and different bracket nesting for the optional directory). It would be good to keep these in sync to avoid confusing users.
dcache pool benchmarkcommand