
Merge pull request datastax#785 from sheldonkhall/master
updated documentation to save others time doing what might be a common task
RussellSpitzer committed Sep 16, 2015
2 parents a10311c + 3ee31e5 commit c4d9e87
Showing 1 changed file with 22 additions and 0 deletions: doc/5_saving.md
@@ -278,6 +278,28 @@
val collection = sc.parallelize(Seq(WordCount("dog", 50), WordCount("cow", 60)))
collection.saveAsCassandraTableEx(table2, SomeColumns("word", "count"))
```

To create a table with a custom definition, specifying which columns are partition keys and which are clustering columns:

```scala
import java.util.UUID

import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.{ColumnDef, RegularColumn, TableDef, ClusteringColumn, PartitionKeyColumn}
import com.datastax.spark.connector.types._

// Define the structure of the rows in the RDD
case class OutData(col1: UUID, col2: UUID, col3: Double, col4: Int)

// Define the columns and their roles in the table
val p1Col = new ColumnDef("col1", PartitionKeyColumn, UUIDType)
val c1Col = new ColumnDef("col2", ClusteringColumn(0), UUIDType)
val c2Col = new ColumnDef("col3", ClusteringColumn(1), DoubleType)
val rCol  = new ColumnDef("col4", RegularColumn, IntType)

// Create the table definition (keyspace "test", table "words")
val table = TableDef("test", "words", Seq(p1Col), Seq(c1Col, c2Col), Seq(rCol))

// Map an existing RDD (assumed here) into the custom structure, then create the table and save to it
val rddOut = rdd.map(s => OutData(s._1, s._2(0), s._2(1), s._3))
rddOut.saveAsCassandraTableEx(table, SomeColumns("col1", "col2", "col3", "col4"))
```
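
The definition above yields a table with primary key `(col1, col2, col3)`, where `col1` is the partition key and `col2`, `col3` are clustering columns. As a quick check (a sketch, assuming a reachable Cassandra cluster and the imports from the previous example), the new table can be read back with the standard `cassandraTable` call:

```scala
// Read the newly created table back to confirm the schema and the saved rows.
// Assumes `sc` is the same SparkContext used above and Cassandra is reachable.
val saved = sc.cassandraTable("test", "words")
saved.take(5).foreach(println)
```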

## Tuning
The following properties, set in `SparkConf`, can be used to fine-tune the saving process:
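
For example, write-related options can be set when building the configuration. This is a minimal sketch; the property names shown are assumed from the connector's output options and their defaults may vary by connector version:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// A minimal sketch of tuning the save path via SparkConf.
// The property names below are assumptions about the connector's output options
// and may differ between versions; check the property list in this section.
val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "127.0.0.1")
  .set("spark.cassandra.output.batch.size.rows", "auto")      // rows grouped into a single batch
  .set("spark.cassandra.output.concurrent.writes", "5")       // batches written in parallel per task
  .set("spark.cassandra.output.consistency.level", "LOCAL_QUORUM")
val sc = new SparkContext(conf)
```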