Merge pull request guacsec#46 from mihaimaruseac/nodes
Add nodes for attestations, artifacts and builders
mihaimaruseac authored Aug 30, 2022
2 parents 61c89dc + 0baa2d5 commit 369f948
Showing 2 changed files with 249 additions and 22 deletions.
111 changes: 93 additions & 18 deletions pkg/assembler/assembler.go
@@ -17,43 +17,118 @@ package assembler

type assembler struct{} //nolint: unused

// Identifiable implements the ability to retrieve a set of
// attributes such that a graph query is able to identify a
// GuacNode or GuacEdge uniquely (or as a GuacHyperNode).
type Identifiable interface {
// Identifiers returns a map of fields and values which
// can be used to identify an object in the graph.
Identifiers() map[string]interface{}
}
// NOTE: `GuacNode` and `GuacEdge` interfaces are very experimental and might
// change in the future as we discover issues with reading/writing from the
// graph database.
//
// For now, the design of the interface follows these guidelines:
//
// 1. We want to serialize `GuacNode`s and `GuacEdge`s to a graph database
// (e.g. Neo4j) without creating duplicate nodes. To do this, we need the
// ability to uniquely identify a node. Since a node could be created from
// different document types, it can be uniquely identified by different
// subsets of attributes/properties. For example, we could have a node
// that is identified by an `"id"` field from one document and by the pair
// `"name"`, `"digest"` from another one.
// 2. Nodes can also have attributes that are not unique and are generated
// from various documents.
// 3. In order to write the serialization/deserialization code, we need to
// get the name of the attributes separate from the pairing between the
// attribute and the value.
//
// In broad strokes, the serialization process for a node would look like:
//
// 1. For each identifiable set in `IdentifiablePropertyNames()` check if the
// node has values for all of the specified properties. If one is missing,
// try the next set. If no set is left, panic.
// 2. If a set of identifiable properties is found and we have values for all
// of these, write a query that would match on nodes which have these
// property:value attributes. The graph database engine will allow us to
// run separate code if a node already exists or one is newly created. In
// our case, in both instances we will just need to set the other
// attributes that have a value. To do this, the `Properties()` returned
// map will be passed directly to the prepared statement (which uses
// `Type()` to select the graph database node type and `PropertyNames()`
// to build the rest of the query).
//
// The serialization process for an edge would be similar, with the caveat that
// an edge is always created between two existing nodes.
//
// Deserialization is left for later, with the only caveat that we might
// envision a case where we'd like to match on edges without first matching on
// their endpoints (e.g., "retrieve all attestations from this time period and
// for each of them return the artifact nodes"). Hence, we need ways to
// uniquely identify edges without having endpoint nodes.
//
// TODO(mihaimaruseac): Look into using tags of fields to automate
// serialization/deserialization, similar to how json is done.

// GuacNode represents a node in the GUAC graph.
// Note: this is experimental and might change. Please refer to source code for
// more details about usage.
type GuacNode interface {
Identifiable

// Type returns the type of node
Type() string

// Properties returns a map from property names to values for the node
Properties() map[string]interface{}

// PropertyNames returns the names of the properties of the node.
//
// If a string `s` is in the list returned by `PropertyNames` then it
// should also be a key in the map returned by `Properties`.
PropertyNames() []string

// IdentifiablePropertyNames returns a list of tuples of property names
// that can uniquely specify a GuacNode.
//
// Any string found in a tuple returned by `IdentifiablePropertyNames`
// must also be returned by `PropertyNames`.
IdentifiablePropertyNames() [][]string
}

// GuacEdge represents an edge in the GUAC graph.
// Note: this is experimental and might change. Please refer to source code for
// more details about usage.
type GuacEdge interface {
Identifiable
// Type returns the type of edge
Type() string

// Nodes returns the (v,u) nodes of the edge
// where v--edge-->u for directional edges.
//
// For directional edges: v-[edge]->u.
// For non-directional edges there is no guaranteed order.
Nodes() (v, u GuacNode)

// Type returns the type of edge
Type() string

// Properties returns the list of properties of the node
// Properties returns a map from property names to values for the edge
Properties() map[string]interface{}

// PropertyNames returns the names of the properties of the edge.
//
// If a string `s` is in the list returned by `PropertyNames` then it
// should also be a key in the map returned by `Properties`.
PropertyNames() []string

// IdentifiablePropertyNames returns a list of tuples of property names
// that can uniquely specify a GuacEdge, as an alternative to the two
// node endpoints.
//
// Any string found in a tuple returned by `IdentifiablePropertyNames`
// must also be returned by `PropertyNames`.
//
// TODO(mihaimaruseac): We might not need this?
IdentifiablePropertyNames() [][]string
}

// AssemblerInput represents the inputs to add to the graph
type AssemblerInput struct {
// Subgraph represents a subgraph read from the database or written to it.
// Note: this is experimental and might change. Please refer to source code for
// more details about usage.
type Subgraph struct {
V []GuacNode
E []GuacEdge
}

// TODO(mihaimaruseac): Write queries to write/read subgraphs from DB?

// AssemblerInput represents the inputs to add to the graph
type AssemblerInput = Subgraph
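
The two serialization steps described in the design comment of this file can be sketched roughly as follows. This is only an illustration, not the query code the project uses: the `node` interface mirrors `GuacNode`, while `artifact`, `identifyingSet`, and `mergeQuery` are hypothetical names introduced here. The sketch assumes a Cypher `MERGE ... SET` statement, treats an empty string as "no value", sets every non-identifying property rather than only those that have a value, and returns an error where the comment says to panic.

```go
package main

import (
	"fmt"
	"strings"
)

// node mirrors the methods of GuacNode that this sketch relies on.
type node interface {
	Type() string
	Properties() map[string]interface{}
	PropertyNames() []string
	IdentifiablePropertyNames() [][]string
}

// artifact is a hypothetical stand-in shaped like ArtifactNode.
type artifact struct{ Name, Digest string }

func (a artifact) Type() string { return "Artifact" }
func (a artifact) Properties() map[string]interface{} {
	return map[string]interface{}{"name": a.Name, "digest": a.Digest}
}
func (a artifact) PropertyNames() []string               { return []string{"name", "digest"} }
func (a artifact) IdentifiablePropertyNames() [][]string { return [][]string{{"digest"}} }

// identifyingSet returns the first identifiable set for which the node has a
// value for every property (step 1). Assumption: "has a value" means the key
// is present, non-nil, and not the empty string.
func identifyingSet(n node) ([]string, error) {
	props := n.Properties()
	for _, set := range n.IdentifiablePropertyNames() {
		if len(set) == 0 {
			continue // an empty set cannot identify anything
		}
		complete := true
		for _, name := range set {
			if v, ok := props[name]; !ok || v == nil || v == "" {
				complete = false
				break
			}
		}
		if complete {
			return set, nil
		}
	}
	return nil, fmt.Errorf("no complete identifiable set for node of type %q", n.Type())
}

// mergeQuery builds a parameterized statement that matches on the identifying
// properties and sets the remaining ones (step 2). The map returned by
// Properties() would be passed to the driver as the query parameters.
func mergeQuery(n node) (string, error) {
	set, err := identifyingSet(n)
	if err != nil {
		return "", err
	}
	inSet := make(map[string]bool, len(set))
	match := make([]string, 0, len(set))
	for _, name := range set {
		inSet[name] = true
		match = append(match, fmt.Sprintf("%s: $%s", name, name))
	}
	query := fmt.Sprintf("MERGE (n:%s {%s})", n.Type(), strings.Join(match, ", "))

	var assignments []string
	for _, name := range n.PropertyNames() {
		if !inSet[name] {
			assignments = append(assignments, fmt.Sprintf("n.%s = $%s", name, name))
		}
	}
	if len(assignments) > 0 {
		query += " SET " + strings.Join(assignments, ", ")
	}
	return query, nil
}

func main() {
	query, err := mergeQuery(artifact{Name: "foo", Digest: "sha256:abc"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(query)
}
```

Because the same `SET` clause runs whether the node was matched or newly created, the sketch does not need separate `ON CREATE` and `ON MATCH` branches, which matches the comment's observation that both cases do the same work.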
160 changes: 156 additions & 4 deletions pkg/assembler/nodes.go
@@ -16,9 +16,161 @@
package assembler

// ArtifactNode is a node that represents an artifact
// TODO: implement GuacNode
type ArtifactNode struct{}
type ArtifactNode struct {
Name string
Digest string
}

func (an ArtifactNode) Type() string {
return "Artifact"
}

func (an ArtifactNode) Properties() map[string]interface{} {
properties := make(map[string]interface{})
properties["name"] = an.Name
properties["digest"] = an.Digest
return properties
}

func (an ArtifactNode) PropertyNames() []string {
return []string{"name", "digest"}
}

func (an ArtifactNode) IdentifiablePropertyNames() [][]string {
// An artifact can be uniquely identified by digest
return [][]string{{"digest"}}
}

// AttestationNode is a node that represents an attestation
// TODO: implement GuacNode
type AttestationNode struct{}
type AttestationNode struct {
// TODO(mihaimaruseac): Unsure what fields to store here
FilePath string
Digest string
}

func (an AttestationNode) Type() string {
return "Attestation"
}

func (an AttestationNode) Properties() map[string]interface{} {
properties := make(map[string]interface{})
properties["filepath"] = an.FilePath
properties["digest"] = an.Digest
return properties
}

func (an AttestationNode) PropertyNames() []string {
return []string{"filepath", "digest"}
}

func (an AttestationNode) IdentifiablePropertyNames() [][]string {
// An attestation can be uniquely identified by file path?
return [][]string{{"filepath"}}
}

// BuilderNode is a node that represents a builder for an artifact
type BuilderNode struct {
BuilderType string
BuilderId string
}

func (bn BuilderNode) Type() string {
return "Builder"
}

func (bn BuilderNode) Properties() map[string]interface{} {
properties := make(map[string]interface{})
properties["type"] = bn.BuilderType
properties["id"] = bn.BuilderId
return properties
}

func (bn BuilderNode) PropertyNames() []string {
return []string{"type", "id"}
}

func (bn BuilderNode) IdentifiablePropertyNames() [][]string {
// A builder needs both type and id to be identified
return [][]string{{"type", "id"}}
}

// AttestationForEdge is an edge that represents the fact that an
// `AttestationNode` is an attestation for an `ArtifactNode`.
type AttestationForEdge struct {
AttestationNode AttestationNode
ArtifactNode ArtifactNode
}

func (e AttestationForEdge) Type() string {
return "Attestation"
}

func (e AttestationForEdge) Nodes() (v, u GuacNode) {
return e.AttestationNode, e.ArtifactNode
}

func (e AttestationForEdge) Properties() map[string]interface{} {
return map[string]interface{}{}
}

func (e AttestationForEdge) PropertyNames() []string {
return []string{}
}

func (e AttestationForEdge) IdentifiablePropertyNames() [][]string {
return [][]string{}
}

// BuiltByEdge is an edge that represents the fact that an
// `ArtifactNode` has been built by a `BuilderNode`
type BuiltByEdge struct {
ArtifactNode ArtifactNode
BuilderNode BuilderNode
}

func (e BuiltByEdge) Type() string {
return "BuiltBy"
}

func (e BuiltByEdge) Nodes() (v, u GuacNode) {
return e.ArtifactNode, e.BuilderNode
}

func (e BuiltByEdge) Properties() map[string]interface{} {
return map[string]interface{}{}
}

func (e BuiltByEdge) PropertyNames() []string {
return []string{}
}

func (e BuiltByEdge) IdentifiablePropertyNames() [][]string {
return [][]string{}
}

// DependsOnEdge is an edge that represents the fact that an
// `ArtifactNode` depends on another `ArtifactNode`
type DependsOnEdge struct {
ArtifactNode ArtifactNode
Dependency ArtifactNode
}

func (e DependsOnEdge) Type() string {
return "DependsOn"
}

func (e DependsOnEdge) Nodes() (v, u GuacNode) {
return e.ArtifactNode, e.Dependency
}

func (e DependsOnEdge) Properties() map[string]interface{} {
return map[string]interface{}{}
}

func (e DependsOnEdge) PropertyNames() []string {
return []string{}
}

func (e DependsOnEdge) IdentifiablePropertyNames() [][]string {
return [][]string{}
}
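
The node types in this file are expected to honor the contract documented on `GuacNode`: every name in `PropertyNames()` is a key of `Properties()`, and every name in `IdentifiablePropertyNames()` also appears in `PropertyNames()`. A small check along the following lines could catch violations in a unit test. This is a sketch under assumptions: `guacNode` mirrors the `GuacNode` interface, `ArtifactNode` and `BuilderNode` are copied from the diff, and `checkContract`, `brokenNode`, and the `"purl"` property are hypothetical names used only for illustration.

```go
package main

import "fmt"

// guacNode mirrors the GuacNode interface from assembler.go.
type guacNode interface {
	Type() string
	Properties() map[string]interface{}
	PropertyNames() []string
	IdentifiablePropertyNames() [][]string
}

// ArtifactNode and BuilderNode are copied from the diff above.
type ArtifactNode struct{ Name, Digest string }

func (an ArtifactNode) Type() string { return "Artifact" }
func (an ArtifactNode) Properties() map[string]interface{} {
	return map[string]interface{}{"name": an.Name, "digest": an.Digest}
}
func (an ArtifactNode) PropertyNames() []string               { return []string{"name", "digest"} }
func (an ArtifactNode) IdentifiablePropertyNames() [][]string { return [][]string{{"digest"}} }

type BuilderNode struct{ BuilderType, BuilderId string }

func (bn BuilderNode) Type() string { return "Builder" }
func (bn BuilderNode) Properties() map[string]interface{} {
	return map[string]interface{}{"type": bn.BuilderType, "id": bn.BuilderId}
}
func (bn BuilderNode) PropertyNames() []string               { return []string{"type", "id"} }
func (bn BuilderNode) IdentifiablePropertyNames() [][]string { return [][]string{{"type", "id"}} }

// brokenNode is hypothetical: it identifies itself by a property ("purl")
// that PropertyNames() never reports, which violates the contract.
type brokenNode struct{ ArtifactNode }

func (brokenNode) IdentifiablePropertyNames() [][]string { return [][]string{{"purl"}} }

// checkContract reports the first violation of the GuacNode contract, or nil.
func checkContract(n guacNode) error {
	props := n.Properties()
	names := make(map[string]bool)
	for _, name := range n.PropertyNames() {
		if _, ok := props[name]; !ok {
			return fmt.Errorf("%s: property name %q missing from Properties()", n.Type(), name)
		}
		names[name] = true
	}
	for _, set := range n.IdentifiablePropertyNames() {
		for _, name := range set {
			if !names[name] {
				return fmt.Errorf("%s: identifiable property %q not in PropertyNames()", n.Type(), name)
			}
		}
	}
	return nil
}

func main() {
	nodes := []guacNode{
		ArtifactNode{Name: "foo", Digest: "sha256:abc"},
		BuilderNode{BuilderType: "example-type", BuilderId: "example-id"},
		brokenNode{},
	}
	for _, n := range nodes {
		if err := checkContract(n); err != nil {
			fmt.Println("violation:", err)
			continue
		}
		fmt.Printf("%T satisfies the contract\n", n)
	}
}
```

The same check would apply to edges if it were written against the property methods alone, since `GuacEdge` documents the same relationship between the three methods.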
