SSL related tests failures with BCJSSE in FIPS 140 mode #49094

Closed
jkakavas opened this issue Nov 14, 2019 · 13 comments · Fixed by #119618
Labels
low-risk - An open issue or test failure that is a low risk to future releases
:Security/FIPS - Running ES in FIPS 140-2 mode
Team:Security - Meta label for security team
>test-failure - Triaged test failures from CI

Comments

@jkakavas
Member

jkakavas commented Nov 14, 2019

Example build scan: https://gradle-enterprise.elastic.co/s/i5x2i3udx2ifg

SSLConfigurationReloaderTests#testPEMKeyConfigReloading
SSLConfigurationReloaderTests#testReloadingPEMTrustConfig
SSLReloadIntegTests#testThatSSLConfigurationReloadsOnModification

all fail when using BCJSSE in FIPS 140 mode.
For

SSLConfigurationReloaderTests#testPEMKeyConfigReloading
SSLConfigurationReloaderTests#testReloadingPEMTrustConfig

the SSL handshake should fail, but we would expect an exception similar to the SSLHandshakeException that SunJSSE throws.

SSLReloadIntegTests#testThatSSLConfigurationReloadsOnModification

shouldn't throw an exception at all.

All three of them produce the following stack trace:

  2> junit.framework.AssertionFailedError: Unexpected exception type, expected SSLHandshakeException but got org.bouncycastle.tls.TlsFatalAlert: certificate_unknown(46)
        at __randomizedtesting.SeedInfo.seed([9CD1B599F6DF08E2:58A40AAB044F2230]:0)
        at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2724)
        at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2712)
        at org.elasticsearch.xpack.core.ssl.SSLConfigurationReloaderTests.lambda$testPEMKeyConfigReloading$11(SSLConfigurationReloaderTests.java:203)
        at org.elasticsearch.xpack.core.ssl.SSLConfigurationReloaderTests.validateSSLConfigurationIsReloaded(SSLConfigurationReloaderTests.java:548)
        at org.elasticsearch.xpack.core.ssl.SSLConfigurationReloaderTests.testPEMKeyConfigReloading(SSLConfigurationReloaderTests.java:210)

        Caused by:
        org.bouncycastle.tls.TlsFatalAlert: certificate_unknown(46)
            at org.bouncycastle.jsse.provider.ProvSSLSocketWrap.checkServerTrusted(ProvSSLSocketWrap.java:137)
            at org.bouncycastle.jsse.provider.ProvTlsClient$1.notifyServerCertificate(ProvTlsClient.java:263)
            at org.bouncycastle.tls.TlsUtils.processServerCertificate(TlsUtils.java:3838)
            at org.bouncycastle.tls.TlsClientProtocol.handleServerCertificate(TlsClientProtocol.java:554)
            at org.bouncycastle.tls.TlsClientProtocol.handleHandshakeMessage(TlsClientProtocol.java:434)
            at org.bouncycastle.tls.TlsProtocol.processHandshakeQueue(TlsProtocol.java:545)
            at org.bouncycastle.tls.TlsProtocol.processRecord(TlsProtocol.java:463)
            at org.bouncycastle.tls.RecordStream.readRecord(RecordStream.java:224)
            at org.bouncycastle.tls.TlsProtocol.safeReadRecord(TlsProtocol.java:686)
            at org.bouncycastle.tls.TlsProtocol.blockForHandshake(TlsProtocol.java:324)
            at org.bouncycastle.tls.TlsClientProtocol.connect(TlsClientProtocol.java:83)
            at org.bouncycastle.jsse.provider.ProvSSLSocketWrap.startHandshake(ProvSSLSocketWrap.java:595)
            at org.bouncycastle.jsse.provider.ProvSSLSocketWrap.startHandshake(ProvSSLSocketWrap.java:571)
            at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:436)
            at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384)
            at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
            at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374)
            at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
            at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
            at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
            at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
            at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
            at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
            at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
            at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
            at org.elasticsearch.xpack.core.ssl.SSLConfigurationReloaderTests.lambda$testPEMKeyConfigReloading$9(SSLConfigurationReloaderTests.java:204)
            at org.elasticsearch.xpack.core.ssl.SSLConfigurationReloaderTests.lambda$privilegedConnect$25(SSLConfigurationReloaderTests.java:754)
            at java.base/java.security.AccessController.doPrivileged(AccessController.java:551)
            at org.elasticsearch.xpack.core.ssl.SSLConfigurationReloaderTests.privilegedConnect(SSLConfigurationReloaderTests.java:753)
            at org.elasticsearch.xpack.core.ssl.SSLConfigurationReloaderTests.lambda$testPEMKeyConfigReloading$10(SSLConfigurationReloaderTests.java:204)
            at org.apache.lucene.util.LuceneTestCase._expectThrows(LuceneTestCase.java:2842)
            at org.apache.lucene.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2717)
            ... 4 more

            Caused by:
            java.security.cert.CertificateException: unable to process certificates: TrustAnchor found but certificate validation failed.
                at org.bouncycastle.jsse.provider.ProvX509TrustManager.validatePath(ProvX509TrustManager.java:213)
                at org.bouncycastle.jsse.provider.ProvX509TrustManager.checkTrusted(ProvX509TrustManager.java:144)
                at org.bouncycastle.jsse.provider.ProvX509TrustManager.checkServerTrusted(ProvX509TrustManager.java:127)
                at org.bouncycastle.jsse.provider.ProvSSLSocketWrap.checkServerTrusted(ProvSSLSocketWrap.java:133)
                ... 35 more

                Caused by:
                java.security.cert.CertPathBuilderException: TrustAnchor found but certificate validation failed.
                    at org.bouncycastle.jcajce.provider.PKIXCertPathBuilderSpi.engineBuild(Unknown Source)
                    at java.base/java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297)
                    at org.bouncycastle.jsse.provider.ProvX509TrustManager.validatePath(ProvX509TrustManager.java:200)
                    ... 38 more

                    Caused by:
                    java.security.SignatureException: certificate does not verify with supplied key
                        at org.bouncycastle.jcajce.provider.X509CertificateObject.checkSignature(Unknown Source)
                        at org.bouncycastle.jcajce.provider.X509CertificateObject.verify(Unknown Source)
                        at org.bouncycastle.jcajce.provider.CertPathValidatorUtilities.verifyX509Certificate(Unknown Source)
                        at org.bouncycastle.jcajce.provider.CertPathValidatorUtilities.findTrustAnchor(Unknown Source)
                        at org.bouncycastle.jcajce.provider.PKIXCertPathBuilderSpi.build(Unknown Source)
                        ... 41 more

which seems to indicate that the certificate signature cannot be verified by the JSSE provider (regardless of trust), which is rather unexpected. It is not entirely obvious whether this fails because BCJSSE is in FIPS mode or simply because BCJSSE is used instead of SunJSSE, as we only use BCJSSE to run our FIPS 140 tests.

I've muted these 3 tests for now until we have a solution or resolution in place
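
The nested "Caused by" entries in the trace above are easier to reason about when the chain is walked programmatically. The following is a minimal sketch, not code from the Elasticsearch test suite: the class `CauseChain` and its methods are hypothetical names, and the BouncyCastle alert is matched by its class name (copied from the trace) so that the example compiles without BouncyCastle on the classpath.

```java
import java.io.IOException;
import java.security.SignatureException;
import java.security.cert.CertificateException;
import javax.net.ssl.SSLException;

// Hypothetical helper for illustration; not part of Elasticsearch or BouncyCastle.
final class CauseChain {
    // Class name copied from the stack trace; compared as a string so that
    // BouncyCastle is not required to compile this sketch.
    static final String BC_FATAL_ALERT = "org.bouncycastle.tls.TlsFatalAlert";

    private CauseChain() {}

    // True if any throwable in the cause chain is a JSSE SSLException
    // or has the BouncyCastle TlsFatalAlert class name.
    static boolean isTlsFailure(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof SSLException || BC_FATAL_ALERT.equals(cur.getClass().getName())) {
                return true;
            }
        }
        return false;
    }

    // Follows getCause() until the innermost throwable is reached.
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null) {
            cur = cur.getCause();
        }
        return cur;
    }

    public static void main(String[] args) {
        // Stand-in chain mirroring the shape of the trace; a plain IOException
        // plays the role of the TLS alert here.
        Throwable chain = new IOException("certificate_unknown(46)",
            new CertificateException("unable to process certificates",
                new SignatureException("certificate does not verify with supplied key")));
        Throwable root = rootCause(chain);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

In the trace above, the innermost cause is the java.security.SignatureException, which is the part that points at signature verification rather than at trust configuration.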

@jkakavas jkakavas added >test-failure Triaged test failures from CI :Security/TLS SSL/TLS, Certificates labels Nov 14, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-security (:Security/Network)

jkakavas added a commit to jkakavas/elasticsearch that referenced this issue Feb 5, 2020
jkakavas added a commit that referenced this issue Feb 5, 2020
jkakavas added a commit that referenced this issue Feb 10, 2020
@jkakavas jkakavas changed the title SSL Reloading tests failures with BCJSSE in FIPS 140 mode SSL related tests failures with BCJSSE in FIPS 140 mode Feb 10, 2020
@rjernst rjernst added the Team:Security Meta label for security team label May 4, 2020
jkakavas added a commit to jkakavas/elasticsearch that referenced this issue May 18, 2020
jkakavas added a commit that referenced this issue May 18, 2020
@jkakavas
Member Author

jkakavas commented Jun 3, 2020

Another case is EmailSslTests#testNotificationSslSettingsOverrideSmtpSslTrust, where the BouncyCastle FIPS provider throws an org.bouncycastle.tls.TlsFatalAlert instead of an SSLException.
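
One way a test can tolerate both providers is to accept any of several exception types. The sketch below is a self-contained, hypothetical helper (`ExpectAnyOf.expectAnyOf` is an illustrative name, not an existing Elasticsearch API). Listing java.io.IOException as an accepted type assumes that BouncyCastle's TlsFatalAlert extends it, as javax.net.ssl.SSLException does; that assumption should be checked against the BouncyCastle version in use.

```java
import java.io.IOException;
import java.util.List;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLHandshakeException;

// Hypothetical helper for illustration only.
final class ExpectAnyOf {
    @FunctionalInterface
    interface ThrowingRunnable {
        void run() throws Throwable;
    }

    private ExpectAnyOf() {}

    // Runs the action and returns the thrown exception if it is an instance of
    // one of the expected types; otherwise fails with an AssertionError.
    static Throwable expectAnyOf(List<Class<? extends Throwable>> expected, ThrowingRunnable action) {
        try {
            action.run();
        } catch (Throwable t) {
            for (Class<? extends Throwable> type : expected) {
                if (type.isInstance(t)) {
                    return t;
                }
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("Expected one of " + expected + " but nothing was thrown");
    }

    public static void main(String[] args) {
        // The lambda stands in for the connection attempt made by the test.
        Throwable thrown = expectAnyOf(
            List.of(SSLException.class, IOException.class),
            () -> { throw new SSLHandshakeException("simulated handshake failure"); });
        System.out.println("Accepted: " + thrown.getClass().getName());
    }
}
```

Accepting a broad type such as IOException trades precision for portability, so a test using this approach would likely also want to assert on the message or the cause.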

jkakavas added a commit to jkakavas/elasticsearch that referenced this issue Jun 9, 2020
jkakavas added a commit that referenced this issue Jun 9, 2020
Cause is tracked in #49094
Backport of #51992
jkakavas added a commit to jkakavas/elasticsearch that referenced this issue Jun 17, 2020
jkakavas added a commit that referenced this issue Jun 17, 2020
@dliappis
Contributor

dliappis commented Oct 13, 2020

I also saw org.bouncycastle.tls.TlsFatalAlert: handshake_failure(40) in https://gradle-enterprise.elastic.co/s/ugq5y3o4yrdgq during x-pack:plugin:security:internalClusterTest / testSnapshotUserRoleCanSnapshotAndSeeAllIndices

./gradlew ':x-pack:plugin:security:internalClusterTest' --tests "org.elasticsearch.xpack.security.authz.SnapshotUserRoleIntegTests.testSnapshotUserRoleCanSnapshotAndSeeAllIndices" -Dtests.seed=F4A3DCA82E28F4F1 -Dtests.security.manager=true -Dtests.locale=tr-TR -Dtests.timezone=Etc/GMT-4 -Druntime.java=11 -Dtests.fips.enabled=true

Unfortunately I couldn't reproduce this locally.

@jkakavas
Member Author

Thanks @dliappis !

In the above, the handshake_failure error prevents the cluster from forming, so testSnapshotUserRoleCanSnapshotAndSeeAllIndices fails with

org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (delete_repository [*]) within 30s

so this is not related to the kind of failures we are tracking here.

The reproduction line is from FieldLevelSecurityRandomTests though. Maybe copied over from a different failure log?

@dliappis
Contributor

The reproduction line is from FieldLevelSecurityRandomTests though. Maybe copied over from a different failure log?

Ugh my mistake, sorry. Will update the repro.

@dliappis
Contributor

I have updated the above repro; it doesn't reproduce locally, though.

@jkakavas do you think it makes sense to track this in a separate bug?

@jkakavas
Member Author

It has failed exactly once with this, and I believe that the TLS exceptions might be a symptom and not the cause here. I will keep track of this for the next couple of days, but since it doesn't reproduce, I don't think we should mute or track this in an issue for now.

@astefan
Contributor

astefan commented Oct 22, 2020

I caught a report similar to the one from @dliappis on CI this morning:

  2> Oct 22, 2020 5:37:02 AM org.bouncycastle.jsse.provider.ProvTlsServer notifyAlertRaised
  2> INFO: Server raised fatal(2) handshake_failure(40) alert: Failed to process record
  2> org.bouncycastle.tls.TlsFatalAlert: handshake_failure(40)
  2> 	at org.bouncycastle.tls.TlsProtocol.handleAlertWarningMessage(TlsProtocol.java:184)
  2> 	at org.bouncycastle.tls.TlsServerProtocol.handleAlertWarningMessage(TlsServerProtocol.java:413)
  2> 	at org.bouncycastle.tls.TlsProtocol.handleAlertMessage(TlsProtocol.java:161)
  2> 	at org.bouncycastle.tls.TlsProtocol.processAlertQueue(TlsProtocol.java:570)
  2> 	at org.bouncycastle.tls.TlsProtocol.processRecord(TlsProtocol.java:435)
  2> 	at org.bouncycastle.tls.RecordStream.readFullRecord(RecordStream.java:184)
  2> 	at org.bouncycastle.tls.TlsProtocol.safeReadFullRecord(TlsProtocol.java:727)
  2> 	at org.bouncycastle.tls.TlsProtocol.offerInput(TlsProtocol.java:1059)
  2> 	at org.bouncycastle.tls.TlsProtocol.offerInput(TlsProtocol.java:1027)
  2> 	at org.bouncycastle.jsse.provider.ProvSSLEngine.unwrap(ProvSSLEngine.java:445)
  2> 	at java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:677)
  2> 	at org.elasticsearch.xpack.security.transport.nio.SSLDriver.unwrap(SSLDriver.java:178)
  2> 	at org.elasticsearch.xpack.security.transport.nio.SSLDriver.access$1300(SSLDriver.java:50)
  2> 	at org.elasticsearch.xpack.security.transport.nio.SSLDriver$RegularMode.read(SSLDriver.java:327)
  2> 	at org.elasticsearch.xpack.security.transport.nio.SSLDriver.read(SSLDriver.java:119)
  2> 	at org.elasticsearch.xpack.security.transport.nio.SSLChannelContext.read(SSLChannelContext.java:165)
  2> 	at org.elasticsearch.nio.EventHandler.handleRead(EventHandler.java:139)
  2> 	at org.elasticsearch.nio.NioSelector.handleRead(NioSelector.java:420)
  2> 	at org.elasticsearch.nio.NioSelector.processKey(NioSelector.java:246)
  2> 	at org.elasticsearch.nio.NioSelector.singleLoop(NioSelector.java:174)
  2> 	at org.elasticsearch.nio.NioSelector.runLoop(NioSelector.java:131)
  2> 	at java.base/java.lang.Thread.run(Thread.java:834)

with the following test failing:

org.elasticsearch.xpack.security.transport.ServerTransportFilterIntegrationTests > testThatConnectionToClientTypeConnectionIsRejected FAILED
    ProcessClusterEventTimeoutException[failed to process cluster event (delete_repository [*]) within 30s]
        at __randomizedtesting.SeedInfo.seed([EB44AB73D78B668B:D02A53132EE64A57]:0)
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:143)
        at java.util.ArrayList.forEach(ArrayList.java:1541)
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:142)
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:678)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.lang.Thread.run(Thread.java:834)

But it doesn't reproduce for me.

./gradlew ':x-pack:plugin:security:internalClusterTest' --tests "org.elasticsearch.xpack.security.transport.ServerTransportFilterIntegrationTests.testThatConnectionToClientTypeConnectionIsRejected" -Dtests.seed=EB44AB73D78B668B -Dtests.security.manager=true -Dtests.locale=zh-TW -Dtests.timezone=MET -Druntime.java=11 -Dtests.fips.enabled=true

Build scan https://gradle-enterprise.elastic.co/s/anyrby5ddynwq

@jkakavas
Member Author

These are all unrelated to the problem this issue is tracking, so can we open a new issue so that we don't miss it?

@astefan
Contributor

astefan commented Oct 22, 2020

Done #64044. Thank you @jkakavas.

@ywangd ywangd added the :Security/FIPS Running ES in FIPS 140-2 mode label Dec 31, 2020
@ywangd ywangd removed the :Security/TLS SSL/TLS, Certificates label Dec 31, 2020
@albertzaharovits albertzaharovits self-assigned this Jan 14, 2021
@elasticsearchmachine
Collaborator

Pinging @elastic/es-security (Team:Security)

@gwbrown gwbrown added the low-risk An open issue or test failure that is a low risk to future releases label Oct 12, 2023
@elasticsearchmachine elasticsearchmachine closed this as not planned Won't fix, can't repro, duplicate, stale Nov 5, 2024
@elasticsearchmachine
Collaborator

This issue has been closed because it has been open for too long with no activity.

Any muted tests that were associated with this issue have been unmuted.

If the tests begin failing again, a new issue will be opened, and they may be muted again.

@slobodanadamovic
Contributor

slobodanadamovic commented Nov 6, 2024

Reopening. The tests are still muted in FIPS mode. We should unmute them and see if we still get failures.

@jakelandis jakelandis self-assigned this Jan 6, 2025
jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Jan 8, 2025
Fixes and un-mutes tests associated with FIPS.
Most of the fixes are due to differing expected exceptions or log messages when using BouncyCastle as the JCE/JSSE provider.
Only test code is changed with this commit.

fixes: elastic#49094
elasticsearchmachine pushed a commit that referenced this issue Jan 8, 2025
Fixes and un-mutes tests associated with FIPS.
Most of the fixes are due to differing expected exceptions or log messages when using BouncyCastle as the JCE/JSSE provider.
Only test code is changed with this commit.

fixes: #49094
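
The commit message above attributes most of the fixes to exceptions and log messages that differ between providers. A minimal sketch of how a test might pick its expectation based on the active JSSE provider follows. It is not the code from the referenced commit: `JsseProviderProbe` and its methods are hypothetical, and both the provider name "BCJSSE" and the choice of java.io.IOException as the BouncyCastle-side expectation are assumptions made for illustration.

```java
import java.io.IOException;
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLHandshakeException;

// Hypothetical helper for illustration; not taken from the Elasticsearch code base.
final class JsseProviderProbe {
    private JsseProviderProbe() {}

    // Name of the provider that supplies the "TLS" SSLContext for this JVM.
    static String activeJsseProviderName() {
        try {
            return SSLContext.getInstance("TLS").getProvider().getName();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No provider offers a TLS SSLContext", e);
        }
    }

    // Assumption: the BouncyCastle JSSE provider registers under the name "BCJSSE".
    static boolean isBouncyCastleJsse(String providerName) {
        return "BCJSSE".equals(providerName);
    }

    // Assumption: under BCJSSE the test accepts any IOException, since the
    // failures reported in this issue surfaced as TlsFatalAlert rather than
    // SSLHandshakeException.
    static Class<? extends IOException> expectedHandshakeFailure(String providerName) {
        return isBouncyCastleJsse(providerName) ? IOException.class : SSLHandshakeException.class;
    }

    public static void main(String[] args) {
        String provider = activeJsseProviderName();
        System.out.println(provider + " -> " + expectedHandshakeFailure(provider).getName());
    }
}
```

Branching on the provider keeps the stricter SSLHandshakeException assertion for the default provider while still allowing the FIPS run to pass.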