Collection of memory management fixes.
1. Save/load the memory allocation offset at the beginning and end of each externally-invoked transaction.
2. After tracing field contents, update field correctly when the reference already referenced storage.
Michael Coblenz committed Dec 23, 2022
1 parent 2dd0b9f commit c686dda
Showing 16 changed files with 143 additions and 85 deletions.
1 change: 1 addition & 0 deletions .idea/modules.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 0 additions & 1 deletion .idea/sbt.xml

2 changes: 1 addition & 1 deletion .idea/scala_compiler.xml

1 change: 1 addition & 0 deletions .idea/vcs.xml

9 changes: 9 additions & 0 deletions Obsidian_Runtime/src/main/yul_templates/object.mustache
@@ -26,6 +26,10 @@ object "{{contractName}}" {
mstore(0x40, newFreePtr)
}
function free_last_allocation(size) {
mstore(0x40, sub(mload(0x40), size))
}
function panic_error_0x41() {
mstore(0, shl(224, 0x4e487b71))
mstore(4, 0x41)
@@ -55,6 +59,7 @@ object "{{contractName}}" {
{{! todo: is 4 a magic number or should it be generated based on the object in question? check the ABI }}
if iszero(lt(calldatasize(), 4)) {
{{invokeMain}}
{{memoryAllocationInit}}
{{! TODO 224 is a magic number offset to shift to follow the spec above; check that it's right }}
let selector := shr(224, calldataload(0))
{{dispatchTable}}
@@ -86,6 +91,10 @@ object "{{contractName}}" {
mstore(0x40, newFreePtr)
}

function free_last_allocation(size) {
mstore(0x40, sub(mload(0x40), size))
}

function panic_error_0x41() {
mstore(0, shl(224, 0x4e487b71))
mstore(4, 0x41)
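The `free_last_allocation` helper added in this hunk can be read as the inverse of a bump allocation on the free-memory pointer stored at `0x40`. A minimal Python model follows, assuming the allocator simply advances that pointer by the requested size. The `Memory` class, its method names, and the `0x80` starting offset are hypothetical; the real `allocate_memory` may also round sizes or check for overflow.

```python
FREE_PTR_SLOT = 0x40  # slot holding the free-memory pointer, as in the template


class Memory:
    """Toy word-addressed memory: maps an address to an integer value."""

    def __init__(self, initial_free_ptr=0x80):
        # 0x80 is an assumed starting offset, chosen only for illustration.
        self.words = {FREE_PTR_SLOT: initial_free_ptr}

    def mload(self, addr):
        return self.words.get(addr, 0)

    def mstore(self, addr, value):
        self.words[addr] = value

    def allocate_memory(self, size):
        """Bump allocation: return the old free pointer and advance it."""
        ptr = self.mload(FREE_PTR_SLOT)
        self.mstore(FREE_PTR_SLOT, ptr + size)
        return ptr

    def free_last_allocation(self, size):
        """Mirror of the Yul helper: mstore(0x40, sub(mload(0x40), size))."""
        self.mstore(FREE_PTR_SLOT, self.mload(FREE_PTR_SLOT) - size)


mem = Memory()
start = mem.mload(FREE_PTR_SLOT)
p = mem.allocate_memory(32)
mem.free_last_allocation(32)
```

Because the helper only subtracts from the pointer, it restores the previous state only when `size` matches the most recent allocation and nothing has been allocated since.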
83 changes: 51 additions & 32 deletions papers/yul_evm_writeup/benchmarks.tex
@@ -1,11 +1,9 @@
\subsection{Without Pointers}

\todo{overhead here but it's acceptable. optimizer does a lot. these
contracts don't touch memory, so their tracers are all the trivial
ones. this gives a sense of the baseline overhead we add.}
Obsidian imposes minimal overhead relative to Solidity, resulting in comparable costs.

\begin{figure}[hbtp]
\caption{Benchmarks with Trivial Tracers}
\caption{Gas Costs with Trivial Tracers}
\label{data.1}
\resizebox{\columnwidth}{!}{
\csvautotabular{small_bench.csv}
@@ -15,44 +13,65 @@ \subsection{Without Pointers}

\subsection{With Pointers}

\todo{lots of overhead; it's cheaper to turn off gc, because you're doing
more work collecting than you get repaid for}
When comparing Obsidian contracts that use pointers to Solidity, one must choose an implementation approach for the Solidity version. One approach is to use structs, which can be more efficient than what Obsidian can achieve, but in general could require the programmer to implement a memory allocator. The other approach is to use the \texttt{new} operator to instantiate additional contracts, which can be substantially more expensive than Obsidian's approach. The deployment costs for Obsidian are larger primarily due to the memory management code that the compiler emits. We have not made any significant attempt to optimize that code beyond what the \texttt{--optimize} flag does. These benchmarks correspond to the test called \texttt{SetGetPointer} in the repository.

\begin{figure}[hbtp]
\caption{Benchmarks with Pointer Fields}
\label{data.2}
\resizebox{\columnwidth}{!}{
\csvautotabular{medium_bench.csv}
}
\end{figure}
\begin{table}
\caption{Gas costs with pointer fields}
\begin{tabular}{lll}
\toprule
& Deployment & Invocation \\
\midrule
Obsidian & 451490 & 41240 \\
Solidity, using a struct & 188599 & 22479 \\
Solidity, using \texttt{new} & 229136 & 218458 \\
\bottomrule
\end{tabular}
\end{table}


%\begin{figure}[hbtp]
% \caption{Benchmarks with Pointer Fields}
% \label{data.2}
% \resizebox{\columnwidth}{!}{
% \csvautotabular{medium_bench.csv}
% }
%\end{figure}


\subsection{Linked List}

\todo{the theory is that once you have enough in storage, the rebates take
over and are more than the cost of collecting them. these tests are with
a simple linked list of 4 and 8 items. if you imagine a doubly linked
list backing a priority / de / queue, it'll be faster. that datastructure
is known to be hard to implement in the sort of associative mappings that
are built into solidity, and its use might arise very naturally from a
smart contract that processes queries in arrival order.}
Currently, Obsidian requires that the programmer call \texttt{release} to free objects. This enables us to isolate the costs of the collection process and provides a way of measuring costs at the current stage of development, since the collector does not currently run automatically. We show benchmarks both with and without collection.

\begin{figure}[hbtp]
\caption{Benchmarks With Linked Lists}
\label{data.3}
\resizebox{\columnwidth}{!}{
\csvautotabular{ll_bench.csv}
}
\end{figure}
Collection is only worthwhile once there is enough data to be freed (zeroed); otherwise, the collection cost is not worth the gas. The cost of collection is particularly worthwhile in the case of complicated data structures that may be difficult to implement in Solidity, such as a linked list backing a priority queue.

The benchmarks below compare an Obsidian linked list implementation (LinkedListMed allocates eight nodes) to a pre-existing Solidity linked list implementation\footnote{https://medium.com/coinmonks/linked-lists-in-solidity-cfd967af389b} that has been adapted to work with Solidity 0.8.0. LinkedListMed contains repetitive code because Obsidian currently lacks subtype polymorphism; this inflates the deployment cost significantly (because there are eight linked list node contracts instead of just one). The ``small'' test allocates four nodes instead of eight.

\todo{Note that differences in gas are roughly comparable to the ratio of
filesize differences.}
We were surprised at the high invocation gas costs of the Solidity linked list implementation. In the 4-node test, replacing the hash computation with a constant reduces invocation costs to 112527.

%\todo{the theory is that once you have enough in storage, the rebates take
% over and are more than the cost of collecting them. these tests are with
% a simple linked list of 4 and 8 items. if you imagine a doubly linked
% list backing a priority / de / queue, it'll be faster. that datastructure
% is known to be hard to implement in the sort of associative mappings that
% are built into solidity, and its use might arise very naturally from a
% smart contract that processes queries in arrival order.}

\begin{figure}[hbtp]
\caption{Sizes of Optimized GC and non-GC Linked List Benchmarks}
\label{data.4}
\caption{Gas Costs, Linked Lists}
\label{data.3}
\resizebox{\columnwidth}{!}{
\csvautotabular{sizes.csv}
\csvautotabular{ll_bench.csv}
}
\end{figure}

%
%\todo{Note that differences in gas are roughly comparable to the ratio of
% filesize differences.}
%
%\begin{figure}[hbtp]
% \caption{Sizes of Optimized GC and non-GC Linked List Benchmarks}
% \label{data.4}
% \resizebox{\columnwidth}{!}{
% \csvautotabular{sizes.csv}
% }
%\end{figure}
5 changes: 3 additions & 2 deletions papers/yul_evm_writeup/futurework.tex
@@ -18,13 +18,14 @@ \subsection{Optimizations}
of future work.

In the currently emitted code, the translation of each transaction checks
the location of its instance before each and every interaction with either
the location of its instance before every interaction with either
memory or storage. The answer to this check is always the same within a
given execution of the transaction, so all but the first check are
redundant. A solution to this is to emit two Yul transactions per Obsidian
transaction: one that operates entirely on memory and one on storage. This
means that before calling a transaction you need to check once where its
instance lives, but after that there are no additional checks.
instance lives, but after that there are no additional checks. This would reduce invocation cost
but increase deployment cost.

The emitted code currently carries a fair amount of repeated code between
the deployment and invocation blocks. This could be reduced, since the full
8 changes: 3 additions & 5 deletions papers/yul_evm_writeup/ll_bench.csv
@@ -1,5 +1,3 @@
test name,gas used for deploy,gas used for invoke,total gas
LinkedListShort.obs,185438,26751,212189
LinkedListShortNoGC.obs,111895,26499,138394
LinkedListMed.obs,228890,27139,256029
LinkedListMedNoGC.obs,118162,26875,145037
Test,Obsidian depl.,Obsidian inv., Solidity depl., Solidity inv.
LinkedListShort,371531,44713,178377,392304
LinkedListMed,642939,56320,187893,406604
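One way to read the new Obsidian and Solidity columns is to ask how many invocations it takes for Obsidian's cheaper invocations to offset its higher deployment cost. The sketch below does that arithmetic on the two rows above. It assumes every invocation costs the same as the single measured one, which the benchmark does not establish; `break_even_invocations` is a hypothetical helper name.

```python
import csv
import io
import math

# The new contents of ll_bench.csv, copied from the diff above.
LL_BENCH = """Test,Obsidian depl.,Obsidian inv., Solidity depl., Solidity inv.
LinkedListShort,371531,44713,178377,392304
LinkedListMed,642939,56320,187893,406604
"""


def break_even_invocations(row):
    """Smallest number of invocations at which Obsidian's total gas is no
    higher than Solidity's, assuming a constant per-invocation cost."""
    extra_deploy = row["Obsidian depl."] - row["Solidity depl."]
    saved_per_call = row["Solidity inv."] - row["Obsidian inv."]
    if extra_deploy <= 0:
        return 0
    if saved_per_call <= 0:
        return None  # Obsidian never catches up under this model
    return math.ceil(extra_deploy / saved_per_call)


reader = csv.reader(io.StringIO(LL_BENCH))
header = [h.strip() for h in next(reader)]  # some headers carry a leading space
rows = {}
for line in reader:
    rows[line[0]] = dict(zip(header[1:], map(int, line[1:])))

for name, row in rows.items():
    print(name, break_even_invocations(row))
```

Under that assumption, the short list breaks even at the first invocation and the medium list at the second.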
4 changes: 2 additions & 2 deletions papers/yul_evm_writeup/main.tex
@@ -45,8 +45,8 @@ \section{Benchmarks}\label{sec:benchmarks}
\section{Future Work}\label{sec:futurework}
\input{futurework}

\section{Related Work}\label{sec:relatedwork}
\input{relatedwork}
%\section{Related Work}\label{sec:relatedwork}
%\input{relatedwork}

\section{Conclusion}\label{sec:conclusion}
\input{conclusion}
8 changes: 4 additions & 4 deletions papers/yul_evm_writeup/small_bench.csv
@@ -1,4 +1,4 @@
test name,gas used for deploy,gas used for invoke,total gas
AssignLocalAdd.obs,95101,21202,116303
PrimOpsEq.obs,90754,21175,111929
Return.obs,101152,21227,122379
Test,Obsidian depl.,Obsidian inv.,Solidity depl.,Solidity inv.
AssignLocalAdd,98265,21202,81737,21232
PrimOpsEq,91314,21183,79135,21213
Return,106044,21237,85409,21244
25 changes: 17 additions & 8 deletions resources/tests/GanacheTests/LinkedListShort.obs
@@ -17,6 +17,10 @@ contract IntLLEnd{
IntLLEnd@Owned(int p){
payload = p;
}

transaction sum() returns int {
return payload;
}
}

contract IntLL1{
@@ -49,13 +53,18 @@ contract IntLL3{
}
}

main contract LinkedListShort{
transaction main() returns int{
IntLLEnd tail = new IntLLEnd(0);
IntLL1 e1 = new IntLL1(1, tail);
IntLL2 e2 = new IntLL2(1, e1);
IntLL3 e3 = new IntLL3(1, e2);
e3.release();
return 0;
main contract LinkedListShort {
IntLLEnd@Owned list;

transaction LinkedListShort() {
}

transaction main() returns int {
list = new IntLLEnd(57005);
return list.sum();
}

transaction sum() returns int {
return list.sum();
}
}
7 changes: 4 additions & 3 deletions resources/tests/GanacheTests/tests.json
@@ -306,7 +306,7 @@
},
{
"file": "LinkedListShort.obs",
"expected": "0",
"expected": "57005",
"shows_that_we_support": "shows that wipers follow the nested structure; benchmarking"
},
{
@@ -326,8 +326,9 @@
},
{
"file": "LinkedListMedSum.obs",
"expected": "255",
"shows_that_we_support": "shows that linked lists work"
"expected": "0",
"shows_that_we_support": "shows that linked lists work",
"trans": "main"
}
]
}
20 changes: 13 additions & 7 deletions src/main/scala/edu/cmu/cs/obsidian/codegen/CodeGenYul.scala
@@ -306,13 +306,14 @@ object CodeGenYul extends CodeGenerator {
ExpressionStatement(apply("sstore", sto_loc, shift_if_addr)))

// the instructions to allocate memory for the log message and emit it, if set
// val nameLiteral: edu.cmu.cs.obsidian.codegen.Expression = edu.cmu.cs.obsidian.codegen.Literal(LiteralKind.string, name, "string")
val log: Seq[YulStatement] =
if (emit_logs) {
Seq(LineComment("logging"),
// allocate memory to log from
decl_1exp(log_temp, apply("allocate_memory", intlit(32))),
// load what we just wrote to storage to that location
ExpressionStatement(apply("mstore", log_temp, apply("sload", sto_loc))),
ExpressionStatement(apply("mstore", log_temp, apply("mload", mem_loc))),
// emit the log
ExpressionStatement(apply("log1", log_temp, intlit(32), sto_loc))
)
@@ -343,10 +344,12 @@ object CodeGenYul extends CodeGenerator {
}
}

val bailIfAlreadyInStorage = edu.cmu.cs.obsidian.codegen.If(compareToThresholdExp(Identifier("this")), Block(Seq(Leave())))

FunctionDefinition(name = nameTracer(name),
parameters = Seq(TypedName("this", YATAddress())),
returnVariables = Seq(),
body = Block(body :+ Leave()),
body = Block(bailIfAlreadyInStorage +: body :+ Leave()),
inDispatch = false
) +: others.distinctBy(fd => fd.name)
}
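The new `bailIfAlreadyInStorage` guard makes each tracer return immediately when `this` already refers to storage. A toy Python model of that control flow follows. It assumes the storage threshold is `2**255` and that an object is a single value keyed by its address; neither is shown in this diff, and the emitted tracer body does more work than this one-line copy.

```python
STORAGE_THRESHOLD = 1 << 255  # assumed representation; not shown in this diff


def in_storage(addr):
    """Model of compareToThresholdExp: true iff addr >= the threshold."""
    return addr >= STORAGE_THRESHOLD


def trace(this, memory, storage):
    """Toy tracer: copy the object at `this` from memory to storage.

    Returns True if a copy was made, False if the guard bailed out.
    """
    if in_storage(this):
        return False  # the new guard: nothing to do, leave immediately
    storage[this | STORAGE_THRESHOLD] = memory[this]
    return True


memory = {0x80: 57005}
storage = {}
first = trace(0x80, memory, storage)
second = trace(0x80 | STORAGE_THRESHOLD, memory, storage)
```

In this model the second call would look up an address that does not exist in `memory` if the guard were missing, which is the kind of double-tracing the guard rules out.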
@@ -537,7 +540,7 @@ object CodeGenYul extends CodeGenerator {
* @return
*/
def translateStatement(s: Statement, retVar: Option[Identifier], contractName: String, checkedTable: SymbolTable): Seq[YulStatement] = {
s match {
val stmts = s match {
case Return() =>
Seq(Leave())
case ReturnExpr(e) =>
@@ -571,7 +574,8 @@ object CodeGenYul extends CodeGenerator {
// if it's primitive, check if it's a field or not and update or assign
case _: PrimitiveType =>
if (ct.allFields.exists(f => f.name.equals(x))) {
Seq(updateField(ct, Identifier("this"), x, id))
emitLog("primitive field assignment", 0xfa, Identifier("this")) ++
Seq(updateField(ct, Identifier("this"), x, id))
} else {
Seq(assign1(Identifier(x), id))
}
@@ -583,8 +587,8 @@ object CodeGenYul extends CodeGenerator {
// Need to assign BEFORE tracing because the tracer expects the field to be initialized.
// TODO: make this more efficient (avoid double-assignment to the field).
updateField(ct, Identifier("this"), x, id),
codegen.If(condition = compareToThresholdExp(fieldFromObject(ct, Identifier("this"), x)),
body = Block(Seq(ExpressionStatement(apply(nameTracer(contractType.contractName), id)),
codegen.If(condition = compareToThresholdExp(fieldFromObject(ct, Identifier("this"), x)), // if the field resides in storage...
body = Block(Seq(ExpressionStatement(apply(nameTracer(contractType.contractName), id)), // make sure the referenced object is also in storage.
updateField(ct, Identifier("this"), x, mapToStorageAddress(id))))), // Field should point to new address in storage


@@ -673,6 +677,8 @@ object CodeGenYul extends CodeGenerator {
assert(assertion = false, s"TODO: translateStatement unimplemented for ${s.toString}")
Seq()
}

LineComment(s.toString) +: stmts
}

// helper function for a common calling pattern below. todo: there may be a slicker way to do
@@ -775,7 +781,7 @@ object CodeGenYul extends CodeGenerator {
val ct = checkedTable.contractLookup(contractName)
if (ct.allFields.exists(f => f.name.equals(x))) {
val store_id = nextTemp()
Seq(decl_0exp(store_id),
emitLog("fetching field", 0xff, Identifier("this")) ++ Seq(decl_0exp(store_id),
fetchField(ct, x, store_id),
assign1(retvar, store_id))
} else {
3 changes: 2 additions & 1 deletion src/main/scala/edu/cmu/cs/obsidian/codegen/Util.scala
@@ -595,7 +595,8 @@ object Util {
* @return an expression that computes to the corresponding address in storage
*/
def mapToStorageAddress(x: Expression): Expression = {
apply("add", x, storage_threshold)
// This relies on the representation of storage_threshold!
apply("or", x, storage_threshold)
}

/** the returned expression evaluates to true iff the argument is above or equal to the storage threshold
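One reading of the switch from `add` to `or` in `mapToStorageAddress`: if `storage_threshold` is a single high bit, `or` leaves an address that is already in storage unchanged, whereas `add` would move it. The threshold's actual value is not shown in this diff, so the sketch below assumes `2**255` purely for illustration and models EVM arithmetic as wrapping modulo `2**256`. The function names are hypothetical.

```python
WORD_MODULUS = 1 << 256
STORAGE_THRESHOLD = 1 << 255  # assumed: a single high bit marks storage


def map_with_add(x):
    """Old behaviour: add the threshold, wrapping like the EVM `add` opcode."""
    return (x + STORAGE_THRESHOLD) % WORD_MODULUS


def map_with_or(x):
    """New behaviour: set the threshold bit."""
    return x | STORAGE_THRESHOLD


mem_addr = 0x80
storage_addr = map_with_or(mem_addr)

# For an address still in memory, both mappings agree.
agree_on_memory = map_with_add(mem_addr) == storage_addr

# Applied to an address that is already in storage, `or` is a no-op...
or_is_idempotent = map_with_or(storage_addr) == storage_addr

# ...while `add` overflows and lands back on the memory address.
add_wraps_back = map_with_add(storage_addr) == mem_addr
```

This matches the commit message's second item: a field that already referenced storage stays correct under `or`. It also shows why the new comment warns that the code relies on the representation of `storage_threshold`, since the idempotence only holds when the threshold's bits never overlap with those of a memory address.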