Build: Bump hadoop from 3.4.3 to 3.5.0. #15897
@nastra @huaxingao Could you help review this PR? Thank you very much! This change is mainly to support the newly released Hadoop version(
| jetty-compression-server = { module = "org.eclipse.jetty.compression:jetty-compression-server", version.ref = "jetty" } | ||
| jetty-compression-gzip = { module = "org.eclipse.jetty.compression:jetty-compression-gzip", version.ref = "jetty" } | ||
| javax-servlet = { module = "javax.servlet:javax.servlet-api", version.ref = "javax-servlet-api" } | ||
| jetty-server = { module = "org.eclipse.jetty:jetty-server", version.ref = "jetty" } |
this shouldn't be needed, because we're using jetty-compression-server already
Removed. That catalog alias was unused in this PR.
| jakarta-servlet = {module = "jakarta.servlet:jakarta.servlet-api", version.ref = "jakarta-servlet-api"} | ||
| jetty-compression-server = { module = "org.eclipse.jetty.compression:jetty-compression-server", version.ref = "jetty" } | ||
| jetty-compression-gzip = { module = "org.eclipse.jetty.compression:jetty-compression-gzip", version.ref = "jetty" } | ||
| javax-servlet = { module = "javax.servlet:javax.servlet-api", version.ref = "javax-servlet-api" } |
javax-servlet is deprecated and was replaced by jakarta-servlet. Is this something that we can use here instead?
jakarta.servlet can't replace javax.servlet here. Spark 3.4/3.5 with Hive 2 still load legacy javax.servlet.* classes at runtime, and those packages are not binary-compatible. I kept the workaround scoped to testRuntimeOnly / integrationRuntimeOnly in the Spark 3.4 and 3.5 modules only.
I would probably use the previous Hadoop version with Spark 3.5 to avoid pulling in legacy packages
@nastra I’m very sorry that I wasn’t able to continue following up on this PR in a timely manner last month. Recently, I’ve made some improvements and refinements to the related changes. If you have time, could you please take another look when convenient? Thank you very much for your time and help!
@nastra Could you help review this PR again? Thank you very much!
| integrationRuntimeOnly project(path: ':iceberg-core', configuration: 'testArtifacts') | ||
| integrationRuntimeOnly (project(path: ':iceberg-open-api', configuration: 'testFixturesRuntimeElements')) | ||
| // Spark 3.4 + Hive 2 still load legacy javax.servlet classes at runtime | ||
| integrationRuntimeOnly libs.javax.servlet |
please rebase, because Spark 3.4 module has been removed
Thanks for pointing this out. I rebased onto the current master and dropped the Spark 3.4 change since that module has been removed upstream.
| integrationRuntimeOnly project(path: ':iceberg-core', configuration: 'testArtifacts') | ||
| integrationRuntimeOnly (project(path: ':iceberg-open-api', configuration: 'testFixturesRuntimeElements')) | ||
| // Spark 3.5 + Hive 2 still load legacy javax.servlet classes at runtime | ||
| integrationRuntimeOnly libs.javax.servlet |
I would rather use the previous Hadoop version with Spark 3.5 instead of pulling in these dependencies
Good point, thanks. I checked the dependency path and Hadoop 3.5.0 is coming into the Spark 3.5 test and integration runtimes via :iceberg-open-api test fixtures. I dropped the javax.servlet workaround and pinned those runtimes back to Hadoop 3.4.3 instead.
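One way such a pin could be expressed, as a rough sketch rather than the PR's actual change: force every `org.apache.hadoop` artifact on the affected runtime classpaths back to 3.4.3. The configuration names `testRuntimeClasspath` and `integrationRuntimeClasspath` are assumptions inferred from the `testRuntimeOnly` / `integrationRuntimeOnly` buckets mentioned in this thread, and the literal version could equally come from the version catalog.

```groovy
// Sketch only: assumes these resolvable configurations exist in the Spark 3.5 module.
['testRuntimeClasspath', 'integrationRuntimeClasspath'].each { confName ->
  configurations.named(confName).configure {
    resolutionStrategy.eachDependency { details ->
      // Only touch Hadoop artifacts; everything else resolves as usual.
      if (details.requested.group == 'org.apache.hadoop') {
        details.useVersion '3.4.3'
        details.because 'keep the Spark 3.5 runtimes on the previous Hadoop release'
      }
    }
  }
}
```

Scoping the rule to the two runtime classpaths leaves the rest of the build on Hadoop 3.5.0.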
| guava = "33.6.0-jre" | ||
| - hadoop3 = "3.4.3" | ||
| + hadoop3 = "3.5.0" | ||
| + hadoop3-previous = "3.4.3" |
let's name this hadoop3-spark35 to clearly indicate that this is only for Spark 3.5
Done. Renamed it to hadoop3-spark35 to make the scope explicit and updated the Spark 3.5 references accordingly.
| org.codehaus.mojo:animal-sniffer-annotations:1.27 | ||
| org.codehaus.woodstox:stax2-api:4.2 | ||
| org.conscrypt:conscrypt-openjdk-uber:2.5 | ||
| org.glassfish.hk2.external:aopalliance-repackaged:2.6 |
we should probably exclude all of the glassfish dependencies from the runtime module
please also exclude this from testFixturesImplementation(libs.hadoop3.common)
Added the Glassfish excludes on the Kafka Connect runtime hadoop3.common dependency and also mirrored them on :iceberg-open-api test fixtures. I also excluded org.javassist:javassist. Regenerated runtime-deps.txt and verified the runtime baseline again.
| exclude group: 'org.apache.hadoop', module: 'hadoop-auth' | ||
| exclude group: 'org.apache.commons', module: 'commons-configuration2' | ||
| exclude group: 'org.apache.hadoop.thirdparty', module: 'hadoop-shaded-protobuf_3_7' | ||
| exclude group: 'org.eclipse.jetty' |
please add all of the glassfish excludes here. If possible also exclude javassist
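A minimal sketch of what such excludes could look like on the `hadoop3.common` dependency. Only `org.glassfish.hk2.external` (from the runtime-deps listing) and `org.javassist:javassist` are named in this thread; the other `org.glassfish.*` groups are assumptions about what Hadoop 3.5.0 pulls in transitively and should be checked against the actual dependency report. The `implementation` configuration is likewise a placeholder.

```groovy
dependencies {
  implementation(libs.hadoop3.common) {
    // Named in the review thread
    exclude group: 'org.glassfish.hk2.external'
    exclude group: 'org.javassist', module: 'javassist'
    // Assumed additional Glassfish groups; verify with the dependency report
    exclude group: 'org.glassfish.hk2'
    exclude group: 'org.glassfish.jersey.core'
    exclude group: 'org.glassfish.jersey.containers'
    exclude group: 'org.glassfish.jersey.inject'
  }
}
```

Running `./gradlew <project>:dependencies --configuration runtimeClasspath` afterwards shows whether any `org.glassfish` coordinates remain on the classpath.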
| com.microsoft.azure:msal4j-persistence-extension:1.3 | ||
| com.microsoft.azure:msal4j:1.23 | ||
| com.sun.xml.bind:jaxb-impl:2.2 | ||
| com.sun.istack:istack-commons-runtime:3.0 |
we need to check whether this is already included in the LICENSE file for the kafka-connect-runtime (most likely not, so it needs to be added there if this is a required dependency)
| javax.servlet:javax.servlet-api:3.1 | ||
| javax.xml.bind:jaxb-api:2.2 | ||
| javax.xml.stream:stax-api:1.0-2 | ||
| jakarta.annotation:jakarta.annotation-api:1.3 |
LICENSE must be updated to reflect these new dependencies, and entries for dependencies that are no longer bundled must be removed