spooky failure to start with "neither features.syslogAsset or multicollins.thisInstance were specified" #398

Open
cburroughs opened this issue Jan 20, 2016 · 16 comments

Comments

@cburroughs
Contributor

This is on 1.3.0. I had a pair (per DC) of collins instances running. I don't think they had been restarted in O(months). Today I tried to migrate those instances to new servers (copied conf, dumped db, etc.) and collins failed to start with [1]. The spooky, outage-inducing part came when I tried to restart the old servers and they failed with the same error despite no changes to the config. I checked many times over that multicollins was indeed in the config (and the fact that I got it to work later leads me to believe that, as far as multicollins itself is concerned, everything is/was fine).

The eventual workaround was to set:

features {
   syslogAsset = "NameOfDC"
}

in production.conf.

Related discussion in #117. Again, as far as I can tell, neither the code nor the config changed since the last restart.

I looked into the current state of master: https://github.com/tumblr/collins/blob/master/app/collins/util/Tattler.scala#L83

lazy val syslogAsset = Asset.findByTag(Feature.syslogAsset.getOrElse("tumblrtag1")).getOrElse {
  throw new PlayException("", "neither features.syslogAsset or multicollins.thisInstance were specified")
}

and I'm afraid I don't understand what is going on there either. The only reference to multicollins seems to be in the exception.
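One thing the snippet above does make visible: the exception is thrown when the asset lookup by tag comes back empty, yet the message talks about config keys that were not specified. A minimal sketch of how the two cases could be reported separately follows. Every name in it (SyslogAssetLookup, resolve, the error types) is hypothetical and not taken from the Collins code base, and findByTag is only a stand-in for the real asset lookup.

```scala
// Hypothetical sketch; none of these names exist in Collins.
object SyslogAssetLookup {
  sealed trait LookupError { def message: String }

  // No tag could be derived from the configuration at all.
  final case class TagNotConfigured(keys: Seq[String]) extends LookupError {
    def message: String = s"none of ${keys.mkString(", ")} were specified"
  }

  // A tag was configured, but no asset carries it.
  final case class AssetNotFound(tag: String) extends LookupError {
    def message: String = s"no asset with tag '$tag' exists"
  }

  // configuredTag: the tag read from config, if any.
  // findByTag: stand-in for an asset lookup, returning the asset (here just a String).
  def resolve(
      configuredTag: Option[String],
      findByTag: String => Option[String]
  ): Either[LookupError, String] =
    configuredTag match {
      case None =>
        Left(TagNotConfigured(Seq("features.syslogAsset", "multicollins.thisInstance")))
      case Some(tag) =>
        findByTag(tag).toRight(AssetNotFound(tag))
    }
}
```

With something along these lines, a missing asset and a missing config key would produce different messages instead of both reading as a configuration problem.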

[1]

PlayException: Confguration error [neither features.syslogAsset or multicollins.thisInstance were specified]
        at util.config.ConfigAccessor$class.globalError(ConfigAccessor.scala:20)
        at util.config.Feature$.globalError(Feature.scala:10)
        at util.config.Feature$$anonfun$syslogAsset$2.apply(Feature.scala:35)
        at util.config.Feature$$anonfun$syslogAsset$2.apply(Feature.scala:35)
        at scala.Option.getOrElse(Option.scala:108)
        at util.config.Feature$.syslogAsset(Feature.scala:34)
        at util.config.Feature$.validateConfig(Feature.scala:61)
        at util.config.Configurable$class.mergeReferenceAndSave(Configurable.scala:106)
        at util.config.Feature$.mergeReferenceAndSave(Feature.scala:10)
        at util.config.Configurable$class.initialize(Configurable.scala:43)
        at util.config.Feature$.initialize(Feature.scala:10)
        at util.config.Registry$$anonfun$validate$1.apply(Registry.scala:52)
        at util.config.Registry$$anonfun$validate$1.apply(Registry.scala:52)
        at scala.collection.Iterator$class.foreach(Iterator.scala:660)
        at scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:573)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:73)
        at scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:592)
        at util.config.Registry$.validate(Registry.scala:52)
        at collins.config.ConfigPlugin.onStart(ConfigPlugin.scala:14)
        at play.api.Play$$anonfun$start$1$$anonfun$apply$mcV$sp$1.apply(Play.scala:84)
        at play.api.Play$$anonfun$start$1$$anonfun$apply$mcV$sp$1.apply(Play.scala:84)
        at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
        at scala.collection.immutable.List.foreach(List.scala:45)
        at play.api.Play$$anonfun$start$1.apply$mcV$sp(Play.scala:84)
        at play.api.Play$$anonfun$start$1.apply(Play.scala:84)
        at play.api.Play$$anonfun$start$1.apply(Play.scala:84)
        at play.utils.Threads$.withContextClassLoader(Threads.scala:17)
        at play.api.Play$.start(Play.scala:83)
        at play.core.StaticApplication.<init>(ApplicationProvider.scala:51)
        at play.core.server.NettyServer$.createServer(NettyServer.scala:136)
        at play.core.server.NettyServer$$anonfun$main$5.apply(NettyServer.scala:165)
        at play.core.server.NettyServer$$anonfun$main$5.apply(NettyServer.scala:164)
        at scala.Option.map(Option.scala:133)
        at play.core.server.NettyServer$.main(NettyServer.scala:164)
        at play.core.server.NettyServer.main(NettyServer.scala)
@byxorna
Contributor

byxorna commented Jan 21, 2016

@cburroughs I have definitely seen this behavior before, but I can't remember exactly what triggered it. Generally I set multicollins on and use that feature to identify each instance instead of syslog asset. Seems to avoid the issue?

multicollins {
  enabled=true
  thisInstance = "IATA01"
}

@cburroughs
Contributor Author

My multicollins config is

multicollins {
  enabled = true
  instanceAssetType = DATA_CENTER
  locationAttribute = LOCATION
  thisInstance = IAD
}

which is what worked until yesterday.

@yl3w
Copy link
Contributor

yl3w commented Jan 22, 2016

There were definitely initialization order issues in the 1.3.0 release. I can't say for certain that they have all been fixed in master, but I've addressed a few of them since then. Unfortunately this isn't easy to reproduce, and therefore isn't easy to fix.

@william-richard
Contributor

@cburroughs - is the IAD DATA_CENTER asset still in your instance of collins? I think I've seen this error before when the asset did not exist. I would imagine it does exist, since you said you were running with that config until recently; I just want to double check.

@william-richard
Contributor

Also, this may help you debug, but this is where syslogAsset gets populated to the value specified in the multicollins config:
https://github.com/tumblr/collins/blob/v1.3.0/app/util/config/Feature.scala#L32-L36
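Going only by the error message, the fallback presumably has roughly the following shape: use features.syslogAsset if it is set, otherwise fall back to multicollins.thisInstance. This is a guess and not a copy of the linked lines. MultiCollins, syslogTag, and the assumption that the fallback applies only when multicollins is enabled are all hypothetical.

```scala
// Hypothetical sketch of the fallback implied by the error message; not Collins code.
object SyslogTagFallback {
  // Stand-in for the multicollins config block.
  final case class MultiCollins(enabled: Boolean, thisInstance: Option[String])

  // Prefer the explicit feature setting. Otherwise use thisInstance,
  // but only if multicollins is enabled (an assumption made for this sketch).
  def syslogTag(featureTag: Option[String], multi: MultiCollins): Option[String] =
    featureTag.orElse(if (multi.enabled) multi.thisInstance else None)
}
```

If the real code looks anything like this, the error would show up whenever the multicollins values are not readable at the moment the feature config is validated, even though they are present in the file.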

@cburroughs
Contributor Author

Yep, there is still an asset of type 'Data Center' with the tag IAD.

@william-richard
Contributor

@cburroughs were you able to figure this out? Maybe switching on and off multicollins might help fix the problem?

@cburroughs
Contributor Author

No, I have not figured this out, and I have not debugged further since the features.syslogAsset workaround put out the fire.

@discordianfish
Contributor

So I got this error:

Play server process ID is 7203
[info] play - database [collins] connected at jdbc:h2:mem:play
[error] application - Failed to create assetlog
play.api.PlayException: [neither features.syslogAsset or multicollins.thisInstance were specified]

After building collins with activator dist, extracting the zip file in a new directory and running:

java -server -Dhttp.port=9023 -Dconfig.file=$(pwd)/conf/application.conf -DapplyEvolutions.collins=true -cp `pwd`/lib/\* play.core.server.NettyServer /tmp

Turns out this is caused by a missing test/resources/profiles.yaml. Maybe the issue at hand is similar?
Either way, this behavior is problematic. Such runtime dependencies should be included in the dist zip file, the file shouldn't live in test/, and ultimately the error is super misleading. I'm happy to file some issues but not sure where to start.

@discordianfish
Contributor

discordianfish commented Aug 10, 2016

Oh WTF. Now it's again not working on another system.. So well, dunno. Will update if I find out more.

@discordianfish
Contributor

Okay, this was related to a different jvm version. But if this also triggers this error, there is something seriously wrong with the error handling.

@byxorna
Contributor

byxorna commented Aug 10, 2016

I think we have seen this before, where a syntax error in the production.conf leads to very confusing error reporting due to a spaghetti mess of try/catches around parsing configs. @discordianfish

I would check your production.conf very carefully and see if you can find the syntax error.
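To illustrate the kind of error handling that would avoid this masking, here is a minimal sketch that wraps each startup step so a failure keeps both the name of the step and the original exception as its cause. InitErrors, InitException, and runStep are hypothetical names used only for this example; nothing like them is known to exist in Collins.

```scala
import scala.util.{Failure, Try}

// Hypothetical sketch; not part of Collins or Play.
object InitErrors {
  // Carries the failing step's name and keeps the original exception as the cause.
  final class InitException(step: String, cause: Throwable)
      extends RuntimeException(s"initialization step '$step' failed: ${cause.getMessage}", cause)

  // Runs one startup step. Non-fatal exceptions are rewrapped, never replaced
  // by an unrelated message.
  def runStep[A](step: String)(body: => A): Try[A] =
    Try(body).recoverWith { case e => Failure(new InitException(step, e)) }
}
```

A caller could wrap config parsing, asset lookup, and database setup separately, for example InitErrors.runStep("parse production.conf") { ... }, so that the log names the step that actually failed.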

@discordianfish
Contributor

@byxorna I used the config from the repo without any changes. My core issue was that file missing from my working directory.

@discordianfish
Contributor

Oh what a PITA. Is it possible that right now it always returns this error no matter what goes wrong during initialization? I got that error because of that missing file, then because of something triggered by using Java 1.8 instead of 1.7, and now because of something related to the database after changing h2:mem:play to h2:/var/lib/collins/database.h2.

Is there a more stable version? Can't get 1.3.0 running either and that's 2 years old already..

@discordianfish
Contributor

..and for a solr misconfiguration as well as a permission problem.

@discordianfish
Contributor

Okay, finally got this up and running but I'm wondering: Are there plans to fix the error handling?
I would look into it but scala is far outside my comfort zone.
I'm a bit worried about introducing it here with this issue pending, and I don't know how actively collins is still being developed.
