Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This project has 2 modules:
- merlin: contains all the system tests for falcon
- merlin-core: contains all the utilities used by merlin
In addition to falcon server and prism, running full falcon regression requires three clusters. Each of these clusters must have:
- hadoop
- oozie
- hive
- hcat

For specific tests it may be possible to run without all of the clusters and components.
Before running the falcon regression tests, a properties file must be created and populated with cluster details. The file must be created at the location:
Populate it with prism-related properties:
#prism properties
prism.oozie_url =
prism.oozie_location = /usr/lib/oozie/bin
prism.qa_host =
prism.service_user = falcon
prism.hadoop_url =
prism.hadoop_location = /usr/lib/hadoop/bin/hadoop
prism.hostname =
prism.storeLocation = hdfs://
Specify the clusters that you would be using for testing:
servers = cluster1,cluster2,cluster3
For each cluster specify properties:
#cluster1 properties
cluster1.oozie_url =
cluster1.oozie_location = /usr/lib/oozie/bin
cluster1.qa_host =
cluster1.service_user = falcon
cluster1.password = rgautam
cluster1.hadoop_url =
cluster1.hadoop_location = /usr/lib/hadoop/bin/hadoop
cluster1.hostname =
cluster1.cluster_readonly = webhdfs://
cluster1.cluster_execute =
cluster1.cluster_write = hdfs://
cluster1.activemq_url = tcp://
cluster1.storeLocation = hdfs://
cluster1.colo = default
cluster1.namenode.kerberos.principal = nn/
cluster1.hive.metastore.kerberos.principal = hive/
cluster1.hcat_endpoint = thrift://
cluster1.service_stop_cmd = /usr/lib/falcon/bin/falcon-stop
cluster1.service_start_cmd = /usr/lib/falcon/bin/falcon-start
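A quick sanity check of the populated file can catch blank entries before a long regression run. The following is a minimal sketch only: `ClusterPropertiesCheck` and its `REQUIRED_KEYS` subset are hypothetical and not part of merlin, and the URL in `main` is a placeholder. It reads the `servers` list and reports every per-cluster key from the chosen subset that is absent or empty.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of merlin: reports per-cluster keys that are absent or blank.
class ClusterPropertiesCheck {

    // Illustrative subset of the per-cluster keys listed above (an assumption, adjust as needed).
    static final String[] REQUIRED_KEYS = {
        "oozie_url", "hadoop_url", "hostname", "cluster_write", "storeLocation"
    };

    // Returns the fully qualified names (for example "cluster1.hostname") of missing or blank keys.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        String servers = props.getProperty("servers", "");
        for (String server : servers.split(",")) {
            String name = server.trim();
            if (name.isEmpty()) {
                continue;
            }
            for (String key : REQUIRED_KEYS) {
                String value = props.getProperty(name + "." + key);
                if (value == null || value.trim().isEmpty()) {
                    missing.add(name + "." + key);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Placeholder content; in practice load the real properties file instead.
        props.load(new StringReader(
            "servers = cluster1\n"
            + "cluster1.oozie_url = http://oozie.example.com:11000/oozie\n"
            + "cluster1.hostname =\n"));
        System.out.println(missingKeys(props));
    }
}
```

An empty result means every listed cluster has a non-blank value for each key in the subset; it says nothing about whether the values are reachable or correct.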
To avoid cleaning the root tests directory before every test:
On all clusters, as the user that started the falcon server, run:
hdfs dfs -mkdir -p /tmp/falcon-regression-staging
hdfs dfs -chmod 777 /tmp/falcon-regression-staging
hdfs dfs -mkdir -p /tmp/falcon-regression-working
hdfs dfs -chmod 755 /tmp/falcon-regression-working
After creating the file, run the following commands to run the tests:
cd falcon-regression
mvn clean test -Phadoop-2
Profiles Supported: hadoop-2
To run a specific test:
mvn clean test -Phadoop-2 -Dtest=EmbeddedPigScriptTest
If you want to use a specific version of any component, specify it using -D, for example:
mvn clean test -Phadoop-2 -Doozie.version=4.1.0 -Dhadoop.version=2.6.0
ACL tests require multiple user account setup:
ACL tests also require group name of the current user:
For testing with Kerberos, set the keytab properties for the different users:
If you wish to contribute to falcon regression, it's as easy as it gets. All test classes must be added to the directory:
This directory contains subdirectories such as prism, ui, security, etc., which contain tests specific to these aspects of falcon. Any general test can be added directly to the parent directory above. If you wish to write a series of tests for a new feature, feel free to create a new subdirectory. Your test can use the various process/feed/cluster/workflow templates present in:
or you can add your own bundle of XMLs in this directory. Please avoid redundancy of any resource.
Each test class can contain multiple related tests. Let us look at a sample test class. Refer to the comments in the code for guidance:
//The License note must be added to each test
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.falcon.regression;

import org.apache.falcon.regression.core.bundle.Bundle;
import org.apache.falcon.regression.core.helpers.ColoHelper;
import org.apache.falcon.regression.core.response.ServiceResponse;
import org.apache.falcon.regression.core.util.AssertUtil;
import org.apache.falcon.regression.core.util.BundleUtil;
import org.apache.falcon.regression.testHelper.BaseTestClass;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;

@Test(groups = "embedded")
//Every test class must inherit the BaseTestClass. This class
//makes the configured properties available in the test.
public class FeedSubmitTest extends BaseTestClass {

    private ColoHelper cluster = servers.get(0);
    private String feed;

    @BeforeMethod(alwaysRun = true)
    public void setUp() throws Exception {
        //Several Util classes are available, such as BundleUtil, which for example
        //has been used here to read the ELBundle present in falcon/falcon-regression/src/test/resources
        bundles[0] = BundleUtil.readELBundle();
        bundles[0] = new Bundle(bundles[0], cluster);

        //submit the cluster
        ServiceResponse response =
        feed = bundles[0].getInputFeedFromBundle();
    }

    @AfterMethod(alwaysRun = true)
    public void tearDown() {
    }

    //Javadoc must be added for each test function, explaining what the function does
    /**
     * Submit correctly adjusted feed. Response should reflect success.
     *
     * @throws Exception
     */
    @Test(groups = {"singleCluster"})
    public void submitValidFeed() throws Exception {
        ServiceResponse response = prism.getFeedHelper().submitEntity(feed);
    }

    /**
     * Submit and remove feed. Try to submit it again. Response should reflect success.
     *
     * @throws Exception
     */
    @Test(groups = {"singleCluster"})
    public void submitValidFeedPostDeletion() throws Exception {
        ServiceResponse response = prism.getFeedHelper().submitEntity(feed);
        response = prism.getFeedHelper().delete(feed);
        response = prism.getFeedHelper().submitEntity(feed);
    }
}
This class, as the name suggests, tests the feed submission aspect of Falcon. It contains multiple test functions, all of which are test cases for the same feature. This organisation of code must be maintained.
To manipulate feeds, processes and clusters in the various tests, objects of the classes FeedMerlin, ProcessMerlin and ClusterMerlin can be used. There are already existing functions which use these objects, such as setProcessInput, setFeedValidity, setProcessConcurrency, setInputFeedPeriodicity, etc., which should serve your purpose well enough.
Several other utilities make writing tests easier: HadoopUtil has functions to create HDFS directories, delete them, and add data to HDFS; OozieUtil queries Oozie to check coordinator/workflow status; TimeUtil produces lists of dates and directories to aid in data creation; HCatUtil provides HCatalog-related utilities; and there are many others.
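To give a feel for what date-based data creation involves, here is a minimal, self-contained sketch that builds a list of dated directory paths between two instants. It is only an illustration: `DatedDirs` and `datedDirs` are hypothetical names and are not the TimeUtil API, and the `yyyy/MM/dd/HH/mm` layout is an assumption. In real tests, prefer the existing helpers in merlin-core.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the kind of helper TimeUtil provides; not the merlin API.
class DatedDirs {

    // Builds prefix/yyyy/MM/dd/HH/mm paths from start to end (inclusive), stepMinutes apart.
    static List<String> datedDirs(String prefix, LocalDateTime start,
                                  LocalDateTime end, int stepMinutes) {
        if (stepMinutes <= 0) {
            throw new IllegalArgumentException("stepMinutes must be positive");
        }
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy/MM/dd/HH/mm");
        List<String> dirs = new ArrayList<>();
        for (LocalDateTime t = start; !t.isAfter(end); t = t.plusMinutes(stepMinutes)) {
            dirs.add(prefix + "/" + fmt.format(t));
        }
        return dirs;
    }

    public static void main(String[] args) {
        // The prefix is a placeholder path chosen for this example.
        List<String> dirs = datedDirs("/tmp/falcon-regression-staging/data",
            LocalDateTime.of(2015, 1, 1, 0, 0),
            LocalDateTime.of(2015, 1, 1, 0, 10), 5);
        for (String dir : dirs) {
            System.out.println(dir);
        }
    }
}
```

Each resulting path could then be created on HDFS and filled with sample data so that feed instances for those times have input available.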
Coding conventions are strictly enforced. Use the checkstyle XML present in falcon/checkstyle/src/main/resources/falcon in your project to avoid checkstyle errors.
Some tests switch user to run commands as a different user. The location of the binary used to switch user is configurable:
For full falcon regression runs, it might be desirable to pull all Oozie job info and logs at the end of the test. This can be done by configuring:
log.capture.oozie = true
log.capture.oozie.skip_info = false
log.capture.oozie.skip_log = true
log.capture.location = ../
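As a rough illustration of how these four keys might be interpreted, the sketch below reads them from a `Properties` object. The class `OozieLogCaptureSettings` is hypothetical and is not the merlin implementation. The semantics are assumptions inferred from the key names only: `log.capture.oozie` enables capture, each `skip_*` flag suppresses one part of it, and a missing key counts as false.

```java
import java.util.Properties;

// Hypothetical reader for the log.capture.* keys; flag semantics are assumed from the key names.
class OozieLogCaptureSettings {
    final boolean captureInfo;
    final boolean captureLog;
    final String location;

    OozieLogCaptureSettings(Properties props) {
        boolean enabled = flag(props, "log.capture.oozie");
        // Assumption: each skip_* flag suppresses one part of the capture.
        this.captureInfo = enabled && !flag(props, "log.capture.oozie.skip_info");
        this.captureLog = enabled && !flag(props, "log.capture.oozie.skip_log");
        this.location = props.getProperty("log.capture.location", "").trim();
    }

    private static boolean flag(Properties props, String key) {
        // Assumption: a missing key is treated as false.
        return Boolean.parseBoolean(props.getProperty(key, "false").trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("log.capture.oozie", "true");
        props.setProperty("log.capture.oozie.skip_info", "false");
        props.setProperty("log.capture.oozie.skip_log", "true");
        props.setProperty("log.capture.location", "../");
        OozieLogCaptureSettings s = new OozieLogCaptureSettings(props);
        System.out.println("info=" + s.captureInfo + " log=" + s.captureLog
            + " location=" + s.location);
    }
}
```

Under these assumptions, the values shown above would capture job info but skip the job logs, writing the output relative to the parent directory.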
Dumping entities generated by falcon
Add -Dmerlin.dump.staging to the maven command. For example:
mvn clean test -Phadoop-2 -Dmerlin.dump.staging=true