Hoon Basics: Runes | Basic Programs

Overview

Teaching: 30 min
Exercises: 15 min

Questions

How are Hoon programs structured?

What are runes?

Objectives

Identify elements of Hoon such as runes, atoms, and cells.

Diagram Hoon generators into the corresponding abstract syntax tree.

Operate Dojo by entering Hoon commands.

Mount the Unix filesystem and commit changes as necessary.

Use built-in or provided tools (gates and generators).

Compose a simple generator and load it from gen/.

Understand how to pass arguments to and results from gates.

Produce a generator for a mathematical calculation.

Urbit specifies a virtual machine with a minimalist instruction set called Nock. Nock is Turing-complete but far too involved to code directly in, leading to the specification of Hoon as a macro language over native Nock. Hoon affords a clean if alien approach to software construction, relying on a system of “runes” which structure your code as a binary tree. Programming Hoon can feel Lisp-like, but has a number of its own quirks.

What we don’t need to do in this workshop is motivate Urbit adoption: you’re invested enough to be here, which means that you perceive some unique benefits to working on the platform. We are going to focus on the how and to a lesser extent the why of programming on Urbit.

Our first goal today is for you to be able to

Identify Hoon runes and children in both inline and long-form syntax.
Trace a short Hoon expression to its final result.
Execute Hoon code within a running ship.
Produce output as a side effect.

To do this, we need to examine a short Hoon program to see what Hoon looks like, then compose a new program of our own.

Creating a Fakezod

We will set up a fakezod and a backup. Since an Urbit ship has a persistent session (as perceived by itself), it is frequently helpful to toss out a lobotomized pier and start anew. Because this process takes a while the first time we do it locally, we will start the boot now and let it run while we talk:

Navigate to a new directory /home/username/urbit. (If you work somewhere else, you are completely responsible for transposing anything we do that depends on directory paths.)
Download the pill, or pre-packaged Urbit OS configuration: wget https://bootstrap.urbit.org/urbit-v1.5.pill.
Start a new fakezod in this directory with urbit -F zod -B urbit-v1.5.pill. The boot sequence will begin. Let this run in the background while we talk.
When this is done, press Ctrl+D to exit the Urbit session.
Copy the fakezod to a local backup: cp -r zod zod-ist-tot.

Reading the Runes

For instance, consider the following Python program, which implements the Fibonacci sequence as a list:

def fibonacci(n):
  i = 0
  p = 0
  q = 1
  r = []
  while (i < n):
    i = i + 1
    po = p
    p = q
    q = po + q
    r.append(q)
  return r

This is a targum of Hoon rephrased into English-like pseudocode. (Very early on there was a standard version of Hoon not unlike this, although it’s not been used for many years now.)

func  n=uint  {
  let  [i=0 p=0 q=1 r=(list uint)]
  do {
    if (i == n)  return r
  } loop {
    i ← i+1      :: increment counter
    p ← q        :: cycle value forwards
    q ← (p + q)  :: add values
    r ← [q] + r  :: append to list
  }
  match-type  r  :: enforce type restraint
  call  flop     :: reverse the list order before returning
}

Compare the verbal Hoon program side by side with the Python program, then with the actual Hoon it paraphrases:

|=  n=@ud
%-  flop
=+  [i=0 p=0 q=1 r=*(list @ud)]
|-  ^+  r
?:  =(i n)  r
%=  $
  i  +(i)
  p  q
  q  (add p q)
  r  [q r]
==
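
A closer Python transliteration of the Hoon may help when tracing it by hand. This is only a sketch: the name fib_hoon_style is made up for illustration, and it assumes that %= computes every new value against the old subject, which is why a single tuple assignment updates all four names at once.

```python
def fib_hoon_style(n):
    """Mirror the Hoon gate: prepend inside the loop, reverse at the end."""
    i, p, q, r = 0, 0, 1, []          # =+  [i=0 p=0 q=1 r=*(list @ud)]
    while i != n:                     # ?:  =(i n)  r
        # %= computes all new values from the old ones, so update at once
        i, p, q, r = i + 1, q, p + q, [q] + r
    return r[::-1]                    # %-  flop


print(fib_hoon_style(10))
```

Prepending and reversing once at the end mirrors the r [q r] and flop steps, since adding to the head of a Hoon list is cheaper than appending to its tail.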

The terminology used in Urbit and Hoon is often unfamiliar. Sometimes this means that you are dealing with a truly new concept (which would be obscured by overloading an older word like “subroutine” or “function”), whereas sometimes you are dealing with an internal aspect that doesn’t really map well to other systems. The strangeness can be frustrating. The strangeness can make concepts fresh again. You’ll experience both sentiments during this workshop.

Each rune accepts zero or more children. Most runes accept a definite number of children, but a few can accept a variable number; these use == tistis or -- hephep digraphs to indicate termination of the running series. We separate adjacent children by gaps, that is, runs of whitespace longer than a single space (an ace).

Let’s run this Fibonacci program in Urbit. You will need to start a fakezod, at which point we have two options:

The Dojo REPL, which offers some convenient shortcuts to modify the subject for subsequent commands.
A tight loop of text editor and running a “generator”.

Dojo REPL

To input this program directly into Dojo, we will use a shortcut to name this code; in Hoon-speak we say we give it a face. You should copy and paste the program rather than typing it out:

=fib |=  n=@ud
%-  flop
=+  [i=0 p=0 q=1 r=*(list @ud)]
|-  ^+  r
?:  =(i n)  r
%=  $
  i  +(i)
  p  q
  q  (add p q)
  r  [q r]
==

Notice that the Dojo parses your code for compatibility in real time. This makes the typing a bit slow and janky but means that it is fairly difficult to input invalid code or syntax errors. (Nothing is impossible, of course!) If at any point you are typing and nothing appears, it is because you aren’t following a rule of valid Hoon.

Now whenever we want to run the gate, we can use a Lisp-like syntax to operate on a value:

(fib 15)

Generator

The foregoing method works reasonably well when testing short snippets out, but is impractical and doesn’t scale well. What we need to be able to do is put the code into the running ship with a name attached so we can locate it, build it, and evaluate it.

If we cd into the ship’s pier in Unix and ls the directory contents, by default we see nothing. With ls -a, a .urb/ directory containing the ship’s configuration and contents in obfuscated format becomes visible. This directory is not interpretable by us now, so we leave it until a later discussion of the Urbit binary. Mars doesn’t know about Earth: we can’t directly edit the code in Mars, but instead have to edit on Earth and then synchronize to Mars.

In the fakezod, run |mount %home to mount the %clay filesystem to the Unix host filesystem. (It’s actually helpful to do this before you make a backup, or to go ahead and do it pre-emptively in the backup.)
Open a text editor and paste the program. Save this file in zod/home/gen/ as fib.hoon.
In the fakezod, tell the runtime to synchronize Earth-side (Unix-side) changes back into %clay: |commit %home.
Run the generator in the Dojo as +fib 15.

Note the different syntax. In the first case, we have a face in the operating context or subject and we can invoke it directly as a function. In the latter case, we have to tell Dojo that there is a text file somewhere with a given name, and that it should locate it, build it, pass in the arguments, and evaluate it.

Charting Rune Children

Let us read the runes by charting out the children of each rune vertically:

|=
  n=@ud
  %-
    flop
    =+
      [i=0 p=0 q=1 r=*(list @ud)]
      |-
        ^+
          r
          ?:
            =(i n)
            r
            %=
              $
              i
              +(i)
              p
              q
              q
              (add p q)
              r
              [q r]
            ==

A Generator

Let us compose a short generator from scratch.

Project Euler Problem #1

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

Conceptually, we need to build a gate which accepts some number n and yields the sum of the values that meet this criterion.

If we start from the Fibonacci sequence gate, we can build a list of acceptable values, then figure out how to sum the values in the list. Let’s start by building the list then, and introducing new functions or runes if we need them.

Copy fib.hoon to a new file, p1.hoon.
Scan down, top to bottom, and see what needs to be altered or removed.
Test by |commit %home then +p1 10

|=  n=@ud                   :: gate accepts a single @ud argument
%-  roll  :_  add
=+  [i=3 m=0 r=*(list @ud)]
|-  ^+  r
?:  =(i n)  r
?:  =((mod i 3) 0)
  %=  $
    i  +(i)
    r  [i r]
  ==
?:  =((mod i 5) 0)
  %=  $
    i  +(i)
    r  [i r]
  ==
%=  $
  i  +(i)
==

Once you can put this together, the next step is to sum the values together. Hoon is a functional language, which means that it prefers to express operations over whole collections rather than element-by-element loops. In this case, ++roll will help us. It goes in as the second line, as in the listing above:

%-  roll  :_  add

(As a standalone, ++roll works like this:

(roll `(list @ud)`~[9 6 5 3] add)

What is :_ colcab doing here?)
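
For checking your generator's output, here is a rough Python equivalent. It is a sketch rather than a translation of the Hoon: the names multiples_below and p1 are chosen for illustration, and functools.reduce stands in for ++roll.

```python
from functools import reduce
from operator import add


def multiples_below(n):
    """Collect every i in [3, n) divisible by 3 or 5, newest first."""
    r = []
    for i in range(3, n):
        if i % 3 == 0 or i % 5 == 0:
            r = [i] + r               # r  [i r]
    return r


def p1(n):
    # reduce plays the role of ++roll: fold the list with add, starting at 0
    return reduce(add, multiples_below(n), 0)


print(p1(10))
```

The intermediate list for 10 comes out in the same descending order as the list passed to ++roll in the standalone example.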

Irregular Forms

Many runes in common currency are not written in their regular form (tall or wide), but rather using syntactic sugar as an irregular form.

For instance, %- is most frequently written using parentheses () which permits a Lisp-like calling syntax:

(add 1 2)

is equivalent to

%-  add  [1 2]

which in turn is also equivalent to

%-(add [1 2])

All three modes of expression are encountered in production code. Developers balance expressiveness, comprehensibility, and pattern-matching in deciding how lapidary to compose a runic expression. Many extremely common patterns were soon subsumed by a sugar rune.

Abstract Syntax Tree (Optional)

Hoon parses to an abstract syntax tree (AST), which includes cleaning up all of the sugar syntax and non-primitive runes. To see the AST of any given Hoon expression, use !, zapcom, against *hoon.
> !,(*hoon =/(n 4 +(n)))
[%tsfs p=term=%n q=[%sand p=%ud q=4] r=[%dtls p=[%wing p=~[%n]]]]

Key Points

Mars (Urbit) is hermetically sealed from Earth except for system calls.

Hoon structures all programs as binary trees of Nock.

The %clay vane acts as a file system for Urbit.

Hoon Basics: Data Structures

Overview

Teaching: 20 min
Exercises: 10 min

Questions

How does Hoon represent data values and structures?

How can I compose and run multi-line programs?

Objectives

Apply auras to transform an atom.

Identify common Hoon molds, such as lists and tapes.

Produce a generator to convert a value between auras.

Identify common Hoon patterns: atoms, cells, cores, faces, and traps.

Nouns

All values in Urbit are nouns, meaning either atoms or cells. An atom is an unsigned integer. A cell is a pair of nouns.

Atoms

An atom is an unsigned integer of any size. (It may be fruitful for you to think of an atom as being like a Gödel number or a binary machine value.) Since all values are ultimately (collections of) integers, we need a way to tell different “kinds” (or applications) of integers apart.

Atoms have tags called auras which can be used coercively or noncoercively to represent single-valued data. In other words, an aura is a bit of metadata Hoon attaches to a value which tells Urbit how you intend to use a number. The default aura for a value is @ud, unsigned decimal, but of course there are many more. Aura operations are extremely convenient for converting between representations. They are also used to enforce type constraints on atoms in expressions and gates.

For instance, to a machine there is no fundamental difference between binary 0b11011001, decimal 217, and hexadecimal 0xD9. A human coder recognizes them as different encoding schemes and associates tacit information with each: an assembler instruction, an integer value, a memory address. Hoon offers two ways of designating values with auras: either directly by the input formatting of the number (such as 0b1101.1001) or by casting with the irregular syntax of an aura wrapped in backticks, such as `@ux`:

0b1101.1001
`@ud`0b1101.1001  :: yields 217
`@ux`0b1101.1001  :: yields 0xd9
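
The same idea can be seen outside Hoon. In this small Python sketch (an analogy only, not how Hoon implements auras) one integer is printed three ways, and the value itself never changes.

```python
value = 0b11011001              # the same atom, written in binary

print(value)                    # decimal digits, like @ud
print(format(value, "x"))       # hexadecimal digits, like @ux
print(format(value, "b"))       # binary digits, like @ub
```

Only the presentation differs between the three lines, which is the sense in which an aura is metadata about a number rather than a different number.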

Auras

Try the following auras. See if you can figure out how each one is behaving.

`@ud`0x1001.1111
`@ub`0x1001.1111
`@ux`0x1001.1111
`@p`0x1001.1111

`@ud`.1
`@ux`.1
`@ub`.1
`@sb`.1

`@p`0b1111.0000.1111.0000

`@ta`'hello'
`@ud`'hello'
`@ux`'hello'
`@uc`'hello'
`@sb`'hello'
`@rs`'hello'

The atom/aura system represents all simple data types in Hoon: dates, floating-point numbers, text strings, Bitcoin addresses, and so forth. Each value is represented in least-significant byte (LSB) order; for instance, a text string may be deconstructed as follows:

Urbit
`'Urbit'`
`0b111.0100.0110.1001.0110.0010.0111.0010.0101.0101`
`0b111.0100` `0b110.1001` `0b110.0010` `0b111.0010` `0b101.0101`
`116` `105` `98` `114` `85` (ASCII characters)
`t` `i` `b` `r` `U`

Note in the above that leading zeroes are always stripped. Since each atom is an integer, there is no way to distinguish 0 from 00 from 000 etc.
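
The byte ordering can be reproduced in Python as a check. This is a sketch under the assumption that a cord is simply the text's UTF-8 bytes read as a little-endian integer; the helper name cord is illustrative.

```python
def cord(text):
    """Pack UTF-8 bytes into one integer, first character in the lowest byte."""
    return int.from_bytes(text.encode("utf-8"), "little")


atom = cord("Urbit")
print(bin(atom))
# Peel bytes off the low end to recover the characters in reading order
print([chr((atom >> (8 * k)) & 0xFF) for k in range(5)])
```

Because the first character sits in the lowest byte, the written-out binary number reads right to left: t, i, b, r, U.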

In this vein, it’s worth remembering that Dojo automatically parses any typed input and disallows invalid representations. This can lead to confusion until you are accustomed to the type signatures; for instance, try to type 0b0001 into Dojo.

Operators

Hoon has no primitive operators. Instead, aura-specific functions or gates are used to evaluate one or more atoms to produce basic arithmetic results. Gate names are conventionally prefixed with ++ luslus which designates them as arms of a core. (More on this terminology in a future section.) Some gates operate on any input atom auras, while others enforce strict requirements on the types they will accept. Gates are commonly invoked using a Lisp-like syntax in Polish (prefix) notation, with the operator first, followed by the first and second (and following) operands.

Operation	Function	Example
Addition	`++add`	`(add 1 2)` → `3`
Subtraction	`++sub`	`(sub 4 3)` → `1`
Multiplication	`++mul`	`(mul 5 6)` → `30`
Division	`++div`	`(div 8 2)` → `4`
Modulus/Remainder	`++mod`	`(mod 12 7)` → `5`

Following Nock’s lead, Hoon uses loobeans (0 = true) rather than booleans for logical operations. Loobeans are written %.y for true, 0, and %.n for false, 1.

Operation	Function	Example
Greater than, >	`++gth`	`(gth 5 6)` → `%.n`
Greater than or equal to, ≥	`++gte`	`(gte 5 6)` → `%.n`
Less than, <	`++lth`	`(lth 5 6)` → `%.y`
Less than or equal to, ≤	`++lte`	`(lte 5 6)` → `%.y`
Equals, =	`=`	`=(5 5)` → `%.y`
Logical `AND`, ∧	`&`	`&(%.y %.n)` → `%.n`
Logical `OR`, ∨	`|`	`|(%.y %.n)` → `%.y`
Logical `NOT`, ¬	`!`	`!%.y` → `%.n`

Since all operations are explicitly invoked Lisp-style within nested parentheses, there is no need for explicit operator precedence rules. For instance, the logical expression

\[(a < b) \land ((b \geq c) \lor d)\]

is written in Hoon as

&((lth a b) |((gte b c) d))

The Hoon standard library, largely defined in %zuse, further defines bitwise operations, arithmetic for both integers and floating-point values (half-width, single-precision, double-precision, and quadruple-precision), string operations, and more.

Cells

All structures in Nock are binary trees; thus also Hoon. This can occasionally lead to some awkward addressing when composing tetchy library code segments that need to interface with many different kinds of gates, but by and large is an extremely helpful discipline of thought.

For instance, a list of characters (or tape, one of Hoon’s string types) can be addressed directly by the rightward-branching cell addresses or via a convenience notation (which is more intuitive). (We’ll see more of this in a moment when we discuss text data structures.)

> +1:"hello"
i="hello"
> +2:"hello"
i='h'
> +3:"hello"
t="ello"
> +4:"hello"
dojo: hoon expression failed
> +5:"hello"
dojo: hoon expression failed
> +6:"hello"
i='e'
> +7:"hello"
t="llo"
> +14:"hello"
i='l'
> +15:"hello"
t="lo"
> +30:"hello"
t="l"
> +31:"hello"
t="o"
> +62:"hello"
i='o'
> +63:"hello"
t=""
> +127:"hello"
dojo: hoon expression failed
>
>
>
> &1:"hello"
i='h'
> &2:"hello"
i='e'
> &3:"hello"
i='l'
> &4:"hello"
i='l'
> &5:"hello"
i='o'

Cell Construction

Produce a cell representation of the fruit tree. You may use the following Unicode strings as @t cords:

'🍇'

'🍌'

'🍉'

'🍏'

'🍋'

'🍑'

'🍊'

'🍍'

'🍒'

Use ~ as a placeholder for an empty node when necessary.
Solution
[[[['🍏' ~] '🍇'] [[~ ['🍑' [~ '🍍']]] '🍌']] [['🍉' [~ '🍋']] [~ [~ ['🍊' ['🍒' ~]]]]]]

More common than direct numerical addressing (of either form), however, is lark addressing, which is a quirky but legible shorthand. The head and tail of each cell are selected by alternating +/- and </> pairs, which is readable once you know what you’re looking at.

> =hello "hello"
> -.hello
i='h'
> +.hello
t="ello"
> -<.hello
dojo: hoon expression failed
> ->.hello
dojo: hoon expression failed
> +<.hello
i='e'
> +>.hello
t="llo"
> +>-.hello
i='l'
> +>+.hello
t="lo"
> +>+<.hello
i='l'
> +>+>.hello
t="o"
> +>+>-.hello
i='o'
> +>+>+.hello
t=""
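
Lark notation and numeric addresses carry the same information. As a sketch of the correspondence (the function name lark_to_axis is made up for this example), each symbol doubles the address for a head step and doubles it and adds one for a tail step.

```python
def lark_to_axis(lark):
    """Convert lark notation such as '+>-' to a numeric tree address."""
    axis = 1
    for position, symbol in enumerate(lark):
        # Even positions use - (head) or + (tail); odd positions use < or >
        head, tail = ("-", "+") if position % 2 == 0 else ("<", ">")
        if symbol == head:
            axis = 2 * axis
        elif symbol == tail:
            axis = 2 * axis + 1
        else:
            raise ValueError(f"unexpected {symbol!r} at position {position}")
    return axis


for lark in ["-", "+", "+<", "+>", "+>+>-"]:
    print(lark, lark_to_axis(lark))
```

The alternation between the two symbol pairs carries no extra meaning; it only makes long paths easier to read.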

Finally, the most general mold is * which simply matches any noun, and thus anything in Hoon at all. The * prefix applied to a mold yields the bunt, or default value of that mold.

> *@ud
0
> *@ux
0x0
> *add
0
> *mul
1

Hoon as Nock Macro (Optional)

The point of employing Hoon is, of course, that Hoon compiles to Nock. Rather than even say compile, however, we should really just say Hoon is a macro of Nock. Each Hoon rune, data structure, and effect corresponds to a well-defined Nock primitive form. We may say that Hoon is to Nock as C is to assembler, except that the Hoon-to-Nock transformation is completely specified and portable. Hoon is ultimately defined in terms of Nock; many Hoon runes are defined in terms of other more fundamental Hoon runes, but all runes parse unambiguously to Nock expressions.

Hoon also expands on Nock through the introduction of metadata, such as auras and core annotations. The Hoon compiler enforces conventions to aid the programmer.

Each Hoon rune has an unambiguous mapping to a Nock representation. Furthermore, each rune has a well-defined binary tree structure and produces a similarly well-structured abstract syntax tree (AST). As we systematically introduce runes, we will expand on what this means in each case: for now, let’s examine a rune and its Nock equivalent.

|= bartis produces a gate or function. Every gate has the same shape, which means certain assumptions about data access and availability can be made.

::  XOR two binary atoms
|=  [a=@ub b=@ub]
`@ub`(mix a b)

maps to the Nock code

[8 [1 0 0] [1 8 [9 1.494 0 4.095] 9 2 10 [6 [0 28] 0 29] 0 2] 0 1]

We call Hoon’s data type specifications molds. Molds are more general than atoms and cells, but these form particular cases. Hoon uses molds as a way of matching Nock tree structures (including Hoon metadata tags such as auras).

Some Data Structures

Lists

A list is a binary tree which branches rightwards. The tape, which we saw earlier, is a special case of this applying to single characters.

A lest is a special case of list: one guaranteed to be non-null. (Certain operations require the stronger guarantee that a list has some content.)

Text

Hoon recognizes two basic text types: the cord or @t and the tape. Cords are single atoms containing the text as UTF-8 bytes interpreted as a single stacked number. Tapes are lists of individual one-element cords. (Lists are null-terminated, and thus so are tapes.)

> +1:"hello"
i="hello"
> +2:"hello"
i='h'
> +3:"hello"
t="ello"
> +4:"hello"
dojo: hoon expression failed
> +5:"hello"
dojo: hoon expression failed
> +6:"hello"
i='e'
> +7:"hello"
t="llo"
> +14:"hello"
i='l'
> +15:"hello"
t="lo"
> +30:"hello"
t="l"
> +31:"hello"
t="o"
> +62:"hello"
i='o'
> +63:"hello"
t=""
> +127:"hello"
dojo: hoon expression failed

Lists have an additional way to grab an element:

Sequential entry, & lets you grab the nth item of a list: &1:~['one' 2 .3.0 .~4.0 ~.5]

> &1:"hello"
i='h'
> &2:"hello"
i='e'
> &3:"hello"
i='l'
> &4:"hello"
i='l'
> &5:"hello"
i='o'
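
The & shorthand lines up with the numeric addresses seen earlier. The sketch below assumes that &n means "take n-1 tails, then the head", which matches the transcript above; the function name pam_axis is illustrative.

```python
def pam_axis(n):
    """Numeric address of the nth list item (1-indexed): n-1 tails, then a head."""
    axis = 1
    for _ in range(n - 1):
        axis = 2 * axis + 1     # step into the tail
    return 2 * axis             # then take the head


print([pam_axis(n) for n in range(1, 6)])
```

Under that assumption the five characters of "hello" sit at addresses 2, 6, 14, 30, and 62.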

In contrast to tapes, cords store the ASCII values in a single arbitrary integer in LSB order.

(More properly, it’s done in UTF-8: here’s Cherokee.)

> 'ᏣᎳᎩ'
'ᏣᎳᎩ'
> `@ux`'ᏣᎳᎩ'
0xa9.8ee1.b38e.e1a3.8fe1
> `@ub`'ᏣᎳᎩ'
0b1010.1001.1000.1110.1110.0001.1011.0011.1000.1110.1110.0001.1010.0011.1000.1111.1110.0001
> :: 0xa9.8ee1  0xb3.8ee1  0xa3.8fe1

Cords are useful as a compact primary storage and data transfer format, but frequently parsing and processing involves converting the text into tape format. There are more utilities for handling tapes, as they are already broken up in a legible manner.

For instance, ++trip converts a cord to a tape; ++crip does the opposite.

++  trip
  |=  a=@  ^-  tape
  ?:  =(0 (met 3 a))  ~
  [^-(@ta (end 3 1 a)) $(a (rsh 3 1 a))]

++met, ++end, and ++rsh are bitwise manipulation gates. Basically they chop up a cord into $2^3 = 8$-bit slices as the elements of a list.

++  crip
  |=  a=tape  ^-  @t
  (rap 3 a)

++rap assembles the list interpreted as cords with block size of $2^3 = 8$ (in this case).
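
To make the bit manipulation concrete, here is a Python sketch of the same two conversions. It assumes one byte per character (plain ASCII) and does not reproduce the general block-size handling of ++rap; the Python names simply reuse trip and crip for readability.

```python
def trip(cord):
    """Split an integer into a list of one-byte characters, low byte first."""
    tape = []
    while cord != 0:
        tape.append(chr(cord & 0xFF))   # (end 3 1 a): the lowest 8 bits
        cord >>= 8                      # (rsh 3 1 a): drop those 8 bits
    return tape


def crip(tape):
    """Reassemble the bytes into one integer, first character lowest."""
    cord = 0
    for k, ch in enumerate(tape):
        cord |= ord(ch) << (8 * k)
    return cord


urbit = int.from_bytes(b"Urbit", "little")
print(trip(urbit))
print(crip(trip(urbit)) == urbit)
```

The loop in trip stops when no bits remain, which plays the role of the =(0 (met 3 a)) test in the Hoon.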

Unicode in Urbit

All text in Urbit is UTF-8 (and typically just 8-bit ASCII). The @c UTF-32 aura is only used by the keyboard vane %dill and Hood (the Dojo terminal agent).

Sudan Function

Implement the Sudan function in Hoon.
\[
\begin{array}{lll}
F_0(x, y) & = x + y & \\
F_{n+1}(x, 0) & = x & \text{if } n \ge 0 \\
F_{n+1}(x, y+1) & = F_n(F_{n+1}(x, y),\ F_{n+1}(x, y) + y + 1) & \text{if } n \ge 0
\end{array}
\]
++  sudan
  |=  [n=@ud x=@ud y=@ud]  ^-  @ud
  ?:  =(n 0)  (add x y)
  ?:  =(y 0)  x
  $(n (dec n), x $(n n, x x, y (dec y)), y (add y $(n n, x x, y (dec y))))
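
A Python version is handy for checking small values against your Hoon. This is a sketch that follows the three cases of the definition directly; the only liberty taken is computing the repeated inner term once.

```python
def sudan(n, x, y):
    """Sudan function F_n(x, y), following the three cases in the definition."""
    if n == 0:
        return x + y
    if y == 0:
        return x
    inner = sudan(n, x, y - 1)          # F_n(x, y-1), used twice
    return sudan(n - 1, inner, inner + y)


print(sudan(1, 1, 1))
print(sudan(2, 1, 1))
```

Keep the arguments small: the function grows extremely quickly, so even modest inputs exhaust the recursion depth or the available time.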

Key Points

All atoms in Urbit are unsigned integers; auras such as @ud tell Hoon how to interpret them.

Molds provide a structured type system for Hoon.

All data in Hoon are binary trees; thus, common patterns arise again and again.

Hoon Basics: Subject-Oriented Programming

Overview

Teaching: 30 min
Exercises: 15 min

Questions

How are Hoon programs structured?

What are runes?

Objectives

Explain what a “subject-oriented language” means.

Identify common Hoon patterns: batteries, doors, arms, wings, and legs.

Create a %say generator.

Review known runes in context of highest-frequency, highest-impact runes.

Subject-Oriented Programming

Urbit adopts an innovative programming paradigm called “subject-oriented programming.” By and large, Hoon (and Nock) is a functional programming language. However, Hoon also very carefully bounds the known context of any part of the program as the subject. Basically, the subject is the noun against which any arbitrary Hoon code is evaluated.

For instance, when we first composed generators, we made what are called “naked generators”: that is, they do not have access to any information outside of the base subject (Arvo, Hoon, and %zuse) and their sample (arguments). Other generators (such as %say generators, described below) can have more contextual information, including random number generators and optional arguments, passed to them to form part of their subject.

Hoon developers frequently talk about “limbs” of the subject. Arms are known labeled addresses (marked with ++ luslus) which carry out computations. Legs are limbs which store data.

Addressing Redux

Labelled arms are invoked as e.g. gates on data. However, if you know the address of a particular value in a limb, you can retrieve it directly using either notation:

Numbered limb, +n, lets you grab the limb at a specific numeric address: +7:[1 2 3]

Lark notation, +/-/</> lets you select by relative address: +>:[1 2 3]

(One challenge of navigation in the current Dojo is that Urbit developer tools return details like hashes of arms rather than supervening labels.)

Addressing the Fruit Tree

Produce the numeric and lark-notated equivalent addresses for each of the following nodes in the binary fruit tree:

🍇

🍌

🍉

🍏

🍋

🍑

🍊

🍍

🍒

Solution

🍇 9 or -<+

🍌 11 or ->+

🍉 12 or +<-

🍏 16 or -<-<

🍋 27 or +<+>

🍑 42 or ->->-

🍊 62 or +>+>-

🍍 87 or ->->+>

🍒 126 or +>+>+<
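
These answers can be derived mechanically. Write the address in binary, drop the leading 1, and read each remaining digit as a step: 0 for head, 1 for tail. A Python sketch of that heuristic follows; the function name axis_to_lark is made up for illustration.

```python
def axis_to_lark(axis):
    """Convert a numeric tree address (2 or greater) to lark notation."""
    if axis < 2:
        raise ValueError("address 1 is the whole noun and has no lark form")
    lark = ""
    # Binary digits after the leading 1 give the path: 0 = head, 1 = tail
    for position, bit in enumerate(bin(axis)[3:]):
        symbols = "-+" if position % 2 == 0 else "<>"
        lark += symbols[int(bit)]
    return lark


for axis in [9, 11, 12, 16, 27, 42, 62, 87, 126]:
    print(axis, axis_to_lark(axis))
```

For example, 87 is 1010111 in binary; dropping the leading 1 leaves 010111, which reads head, tail, head, tail, tail, tail.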

“The Subject and its Legs”

Cores and Derived Structures

Cores

The core is the primary nontrivial data structure of Nock: atoms, cells, cores. A core is defined as a cell of [battery payload]; in the abstract this simply divides the battery or code from the payload or data. Cores can be thought of as similar to objects in object-oriented programming languages, but possess a completely standard structure which allows for detailed introspection and “hot-swapping” of core elements. Everything in standard Hoon and Arvo that cannot be reduced to an atom or a cell is de facto a core. (Indeed, if one wished to separate code and data in a binary tree structure, the only other logical choice one would have available is to flip the order of battery and payload.)

In short a battery is a collection of arms and a payload is a collection of data, possibly from various sources. (At this point the “leg” terminology breaks down a bit, and in practice mostly people talk about “arms.”)

Conventionally, most cores are either produced by the |% barcen rune or are instances of a more complex form (such as a door). Cores are a “live” type; they are not simply holders of data but are expected to operate on data, just as it is uncommon to see a C++ object which only holds data. (The role of a C struct is approximated by the $% buccen tagged union.)

Some terminology is in order, to be expanded on subsequently:

An arm is a Hoon expression to be evaluated against the core subject (i.e. its parent core is its subject).
A leg is a data value.
A limb is either an arm or a leg: formally, “an attribute of a subject.”
A wing is a resolution path pointing to a limb, that is, a search path into the subject.

Altogether, this yields a rich introspective framework for accessing and manipulating cores. We won’t do a lot with it, but if you are interested, look up core genericity and variadicity in the docs.

Shadowing Names (Optional)

In any programming paradigm, good names are valuable and collisions are likely. In Hoon, if you need to catch an outer-context label that has the same name as an inner-context value, use ^ ket to skip the depth-first match.
^^json

Limb Resolution Paths

While the docs on limbs contain a wealth of information on how limbs are resolved by the Hoon compiler, it is worth addressing in brief the two common resolution tools you will encounter today: . dot and : col.

. dot resolves the wing path into the current subject.

: col resolves the wing path with the right-hand-side as the subject.

Operators (Optional)

There’s a lot going on with addressing. In practice, it’s typically readable but subtleties abound.

+ lus is the slot operator.

& pam is the head of a cell.

| bar is the tail of a cell.

. dot is the wing resolution path into the subject (using Nock Zero).

=/(foo [a=3 b=4] b.foo) is 4, because one adds foo=[a=3 b=4] to the subject.

: col is the right-associative search path into the right-hand side as subject (using Nock Seven). (This is also the irregular syntax of =< tisgal.)

=/(foo [a=3 b=4] b:foo) is 4, because one computes the wing foo against the subject beneath the =/ and then uses that as the subject for computing the wing b.

+ lus/- hep/< gal/> gar are the lark notation symbols.

“Arms and Cores”

Traps

The trap creates the basic looping mechanism in Hoon, a special instance of a core which is capable of concisely recursing into itself. The trap is formally a core with one arm named $ buc, and it is either created with |- barhep (for instant evaluation) or |. bardot (for deferred evaluation).

In practice, traps are used almost exclusively as recursion points, much like loops in an imperative language. For instance, the following program counts from 5 down to 1, emitting output via ~& sigpam at each iteration, then returns ~ sig.

=/  count  5
|-
?:  =(count 0)  ~
~&  count
$(count (dec count))

The final line $(count (dec count)) serves to modify the subject (at count) then to pull the $ buc arm again. In practice the tree unrolls as follows, with indentation indicating code running “inside” of another rune.
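
One way to picture the unrolling is the Python sketch below, which is an analogy rather than Hoon. The function name unroll and the wording of each trace line are invented for illustration; each recursive call is indented one level deeper than the call that made it.

```python
def unroll(count, depth=0):
    """Return the lines a trace of the trap would show, one level per call."""
    pad = "  " * depth
    if count == 0:                       # ?:  =(count 0)  ~
        return [f"{pad}count=0, produce ~"]
    lines = [f"{pad}~&  {count}"]        # side-effect output
    # $(count (dec count)): change count, then pull the $ arm again
    return lines + unroll(count - 1, depth + 1)


print("\n".join(unroll(5)))
```

Each level sees a subject in which only count has changed, which is what the $(count (dec count)) expression arranges.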

Gates

Similar to how a trap is a core with one arm, a gate is a core with one arm and a sample. This means that new per-invocation data are available in the gate’s subject.

A gate is a core [battery payload] with one arm $ buc in the battery and a payload consisting of [sample context]; thus, [$ [sample context]]. We’ve encountered gates in generators before (which Dojo knows how to connect), but very frequently they are implemented as arms in a larger core.

When you invoke a gate using %- cenhep (what is the irregular form?), the $ buc arm is pulled and the sample is passed in to it.

Gates—and other cores—have a standard structure, which renders them valid objects of introspection. Indeed, Lisp-style brain surgery on cores is not uncommon, although we will by and large not need it in this workshop.

$:add
$:mul
$:arch

A trap isn’t the only way to recurse in Hoon. In fact, one sees the default $ buc arm invoked directly in many cases to recalculate an entire gate recursively. Consider, for instance, this implementation of the factorial:

|=  n=@ud
?:  =(n 1)
  1
(mul n $(n (dec n)))

In this case, the $ buc arm is directly invoked in the ++mul itself. When the $() recursion point is reached, the entire $ buc arm is re-evaluated with the specified change of n → n-1.

That is, when one reads this program, one reads it as falling into three components:

|=  n=@ud             :: accept a single value n for n!
?:  =(n 1)  1         :: check n ≟ 1; if so, return 1
(mul n $(n (dec n)))  :: multiply n times the product of this arm w/ n-1

“Gates”

Doors

A gate is a particular instance of a core which is “automatically executable.” A door is a more general instance of a gate: really, it’s a gatemaker. It produces gates as necessary.

When using a gate, the calling convention replaces the sample then pulls the $ buc arm. In contrast, when using a door, the calling convention replaces the sample then pulls whichever arm has been requested.

For instance, the random number generator uses the system entropy eny to produce a random number in the range 0–999:

(~(rad og eny) 1.000)

To build a door, use the |_ barcab rune to produce a core with multiple arms (and no default $ buc arm). A running program agent is a door, so we will work more with these incidentally tomorrow.

This is part of the digit-to-cord display door:

++  ne
  |_  tig=@
  ++  d  (add tig '0')
  ++  x  ?:((gte tig 10) (add tig 87) d)
  ++  v  ?:((gte tig 10) (add tig 87) d)
  ++  w  ?:(=(tig 63) '~' ?:(=(tig 62) '-' ?:((gte tig 36) (add tig 29) x)))
  --

++ne is used as follows:

`@t`~(d ne 7)   :: decimal digit as cord
`@t`~(x ne 14)  :: hexadecimal digit as cord
`@t`~(v ne 25)  :: base-32 digit as cord
`@t`~(w ne 52)  :: base-64 digit as cord
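
For readers more at home with objects, the door can be loosely compared to a Python class whose constructor argument is the sample and whose methods are the arms. This is an analogy only, and the class name Ne is made up; a real door is a core, not an object with hidden state.

```python
class Ne:
    """Rough analogue of the ++ne door: tig is the sample, methods are arms."""

    def __init__(self, tig):
        self.tig = tig                  # ~(arm ne tig) replaces the sample

    def d(self):
        return self.tig + ord("0")

    def x(self):
        return self.tig + 87 if self.tig >= 10 else self.d()

    def v(self):
        return self.tig + 87 if self.tig >= 10 else self.d()

    def w(self):
        if self.tig == 63:
            return ord("~")
        if self.tig == 62:
            return ord("-")
        return self.tig + 29 if self.tig >= 36 else self.x()


print(chr(Ne(7).d()), chr(Ne(14).x()), chr(Ne(25).v()), chr(Ne(52).w()))
```

Each method returns a character code, just as each arm of the door returns an atom that the `@t` cast then displays as text.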

Custom Types

Doors and other cores frequently include custom type definitions; these are discussed below in “Molds”.

Manipulating the Sample Directly (Optional)

Since . dot refers to the subject, this yields the ability to manipulate the sample without calling any arm directly. The following examples illustrate:
(add [1 5])               :: call a gate
6
~($ add [1 5])            :: call the $ arm in the door
6
~(. add [1 5])            :: update the sample (arguments) but don't evaluate
<1.otf [[@ud @ud] <45.xig 1.pnw %140>]>
=<  $  ~(. add [1 5])     :: call the $ arm of the door with updated sample
6
=/  addd  ~(. add [1 5])  :: do the same thing but in two steps
  =<  $  addd
6

Code as Specification (Optional)

At this point, we need to step back and contextualize the power afforded by the use of cores. In another language, such as C or Python, we specify a behavior but have relatively little insight into the instantiation effected by the compiler.

Python as written in the interpreter:

metals = ['gold', 'iron', 'lead', 'zinc']
total = 0
for metal in metals:
    total = total + len(metal)

Python as compiled bytecode:

1          0 LOAD_CONST               0 ('gold')
           2 LOAD_CONST               1 ('iron')
           4 LOAD_CONST               2 ('lead')
           6 LOAD_CONST               3 ('zinc')
           8 BUILD_LIST               4
          10 STORE_NAME               0 (metals)

2         12 LOAD_CONST               4 (0)
          14 STORE_NAME               1 (total)

3         16 LOAD_NAME                0 (metals)
          18 GET_ITER
    >>    20 FOR_ITER                16 (to 38)
          22 STORE_NAME               2 (metal)

4         24 LOAD_NAME                1 (total)
          26 LOAD_NAME                3 (len)
          28 LOAD_NAME                2 (metal)
          30 CALL_FUNCTION            1
          32 BINARY_ADD
          34 STORE_NAME               1 (total)
          36 JUMP_ABSOLUTE           20
       >> 38 LOAD_CONST               5 (None)
          40 RETURN_VALUE
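
A listing like the one above can be regenerated with Python's standard `dis` module. The sketch below assumes nothing beyond the standard library; the exact opcodes and offsets vary between Python versions, so your output may not match the listing line for line.

```python
import dis

# The same program as above, kept as source text so it can be compiled.
source = """
metals = ['gold', 'iron', 'lead', 'zinc']
total = 0
for metal in metals:
    total = total + len(metal)
"""

code = compile(source, "<metals>", "exec")
dis.dis(code)  # prints the disassembly for this interpreter version

# Collect the opcode names for inspection.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

# Running the compiled code confirms the behavior is unchanged.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```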

Many popular programming languages specify behavior rather than implementation. By specifying Hoon as a macro language over Nock, the Urbit developers collapse this distinction.

For Hoon (like Lisp), the structure of execution is always front-and-center, or at least only thinly disguised. When one creates a core with a given behavior, one can immediately envision the shape of the underlying Nock. This affords one immense power to craft efficient, effective, graceful programs.

=/  metals  `(list tape)`~["gold" "iron" "lead" "zinc"]
|=  [metals=(list tape)]
=/  count  0
=/  total  0
|-
  ?:  =(count (lent metals))  total
  =/  metal  `tape`(snag count metals)
$(total (add total (lent metal)), count +(count))

!=  |=  [metals=(list tape)]  =/  count  0  =/  total  0  |-  ?:  =(count (lent metals))  total  =/  metal  `tape`(snag count metals)  $(total (add total (lent metal)), count +(count))
[ 8
  [1 0]
  [ 1
    8
    [1 0]
    8
    [1 0]
    8
    [ 1
      6
      [5 [0 14] 8 [9 343 0 32.767] 9 2 10 [6 0 126] 0 2]
      [0 6]
      8
      [8 [9 20 0 32.767] 9 2 10 [6 [0 30] 0 126] 0 2]
      9
      2
      10
      [14 4 0 30]
      10
      [ 6
        8
        [9 36 0 131.071]
        9
        2
        10
        [6 [0 30] 7 [0 3] 8 [9 343 0 65.535] 9 2 10 [6 0 6] 0 2]
        0
        2
      ]
      0
      3
    ]
    9
    2
    0
    1
  ]
  0
  1
]

Molds

Classically, computation was unified by an early and elegant conception of code-as-data, particularly in Lisp but hearkening back to Gödel. Nock knows only about unsigned integers and mechanically applies rules over structures consisting only of unsigned integer values and cells (together, “nouns”). However, for human-legibility we would like more sophisticated type annotation, whether coercive or purely in metadata.

The aura is a particular example of a mold, the type enforcement mechanism in Hoon. A mold is a specific type definition, customarily defined as an arm in a |% core. We commonly see three runes supporting this structure:

+$ lusbuc creates a type constructor arm to define and validate type definitions.
$: buccol creates a collection of named values (type members).
$? bucwut defines a union, a set validating membership across a defined collection of items. (This is similar to a typedef or enum in C-related languages.)

To illustrate these, we consider several ways to define a vehicle. In the first, we employ only +$ lusbuc to capture key vehicle characteristics. Using only lusbuc, it’s hard to say much of interest:

+$  vehicle  tape               :: vehicle identification number

By permitting collections of named type values with $: buccol, we can produce more complicated structures:

+$  vehicle
  $:  vin=tape                  :: vehicle identification number
      owner=tape                :: car owner's name
      license=tape              :: license plate
  ==

Type definition arms can rely on other type definition arms available in the subject:

+$  vehicle
  $:  vin=tape                  :: vehicle identification number
      owner=tape                :: car owner's name
      license=tape              :: license plate
      kind=kind                 :: vehicle manufacture details
  ==
+$  kind
  $:  make=tape                 :: vehicle make
      model=tape                :: vehicle model
      year=@da                  :: nominal year of manufacture (use Jan 1)
  ==

Masking Variables

In general, Hoon style does not require you to be careful about masking variable names in the subject (for instance, using the same name for a face as for its mold, as with kind=kind above). This rarely introduces bugs, since the intended meaning is typically apparent from context. Check the note on “Shadowing Variables” above for more details.

Finally, by introducing unions with $? bucwut (here in its irregular form ?(...)), a type definition arm can validate possible values:

+$  vehicle
  $:  vin=tape                  :: vehicle identification number
      owner=tape                :: car owner's name
      license=tape              :: license plate
      kind=kind                 :: vehicle manufacture details
  ==
+$  kind
  $:  make=make                 :: vehicle make (one of the permitted makes)
      model=tape                :: vehicle model
      year=@da                  :: nominal year of manufacture (use Jan 1)
  ==
+$  make                        :: permitted vehicle makes
  ?(%acura %chrysler %delorean %dodge %jeep %tesla %toyota)

@tas-tagged text elements are extremely common in such type unions, as they afford a human-legible categorization that is nonetheless rigorous to the machine. (This is like a typedef and constant combined, in that it admits only the values listed in the union.)

Generators

Generators are standalone Hoon expressions that evaluate and may produce side effects, as appropriate. They are closely analogous to simple scripts in languages such as Bash or Python. By using generators, one is able to develop more involved Hoon code and run it repeatedly without awkwardness. Put another way, a generator is a nonpersistent computation: it maps an input to an output.

(You will also see commands beginning with a | symbol; these are not generators but %hood commands, handled by the CLI agent for Dojo.)

To run a generator on a ship, prefix its name with + lus. Arguments may be required or optional.

+trouble

:: Only for a real ship.
+moon
+moon ~rinset-lapter-sampel-palnet

Naked Generators

The simplest generator is a simple map of input to output without even a broader subject. We’ve used these already, as with +fib.

A naked generator is so called because it contains no metadata for the Arvo interpreter. Its subject is simply the standard Arvo/%zuse/Hoon stack, and its sample is a simple single noun. (Since a noun can be a cell, you can sneak in more than one argument.) Because they are nonpersistent computations, naked generators are typically straightforward calculators or system queries.

%say Generators

More interesting for most cases are %say generators, which can include more information in their sample. (Dojo knows how to handle these as standard cases because they are tagged with %say in the return cell.)

%say generators do know about Arvo and the subject and are able to leverage information from and about the operating system in performing their calculations.

A basic %say generator looks like this:

:-  %say
|=  *
:-  %noun
(sub 1.000 1)

:- composes a cell.
% in front of text indicates a @tas-style constant. Here, this is a type annotation for the handler evaluating the generator.
* is a mold matching any data type, atom or cell. Since the sample is unused, there’s no point in restricting it.

This generator can accept any input (*) or none at all. It returns, in any case, 999.

To match a particular mold, you can choose from this table. Atoms can be made more specific by appending an aura to the right of the @ (for instance, @ud).

Shorthand	Mold
`*`	noun
`@`	atom
`^`	cell
`?`	loobean
`~`	null

The generator itself consists of a cell [%say hoon], where hoon is the rest of the code. The %say metadata tag indicates to Arvo what the expected structure of the generator is qua %say generator.

In general, a %say generator doesn’t need a sample (input arguments) to complete: Arvo can elide that if necessary. More generally, though, a %say generator is useful any time a calculation needs to depend on user input or system parameters (beyond the static system library).

The maximalist sample is a 3-tuple: [[now eny beak] ~[unnamed arguments] ~[named arguments]].

now is the current time. eny is 512 bits of entropy for seeding random number generators. beak contains the current ship, desk, and case.

How do we leave things out?

Any of those pieces of data could be omitted by replacing part of the noun with * rather than giving them faces. For example, [now=@da * bec=beak] if we didn’t want eny, or [* * bec=beak] if we only wanted beak.

Now

In Dojo, you can always produce the current time as an atom using now. This is a Dojo convenience, however, and we need to bind now to a face if we want to use it inside of a generator.

There are also more sophisticated representations of time: structured forms broken out by field, and atoms typed as @da (absolute) or @dr (relative).

Entropy

What is entropy? Computer entropy is a hardware or behavior-based collection of device-independent randomness. For instance, “The Linux kernel generates entropy from keyboard timings, mouse movements, and IDE timings and makes the random character data available to other operating system processes through the special files /dev/random and /dev/urandom.”

For instance, run cat /dev/random on a Linux box and observe the output. You’ll need to press Ctrl+C to return to the prompt. Run it again, and again. On older Linux kernels, you’ll see that the store of entropy diminishes rather quickly because it is thrown away once it is used; recent kernels generally no longer block in this way.

(And you thought that random number generators just used the time as a seed!)
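
To get a feel for the size of eny, the following Python sketch pulls the same amount of randomness (512 bits) from the operating system. The variable names are only illustrative, and this is not how Urbit itself gathers entropy.

```python
import os

# 64 bytes == 512 bits, the same size as eny.
eny = os.urandom(64)

# Interpret the bytes as one large unsigned integer, much as Hoon treats eny as an atom.
seed = int.from_bytes(eny, "big")

print(len(eny) * 8, "bits of entropy")
print(hex(seed))
```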

Beak

Paths begin with a piece of data called a beak. A beak is formally a cell [p=ship q=desk r=case]; it has three components, and might look like /~dozbud-namsep/home/11.

You can get this information in Dojo by typing %.

Other Arguments

The full sample prototype for a %say generator looks like [[now, eny, beak] [unnamed arguments] [named arguments]].

You see a similar pattern in languages like Python, which permits (required) unnamed arguments before named “keyword arguments”.
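
As a rough point of comparison, here is that Python pattern. The function and argument names are hypothetical and chosen only for illustration.

```python
def poly(a: int, b: int, c: int, *, offset: int = 0) -> int:
    """Required positional arguments first, then an optional keyword argument."""
    return a * b + c + offset

print(poly(1, 2, 3))            # positional ("unnamed") arguments only
print(poly(1, 2, 3, offset=4))  # with a keyword ("named") argument
```

Unlike Python, a %say generator does not let you choose the default for a named argument; that default comes from the mold.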

Unnamed Arguments

By “unnamed” arguments, we really mean required arguments; that is, arguments without defaults. We stub out information we don’t want with the noun mold *, which matches anything:

|=  [* [a=@ud b=@ud c=@ud ~] ~]
(add (mul a b) c)

You can use this in Dojo as well:

=f |=  [* [a=@ud b=@ud c=@ud ~] ~]
(add (mul a b) c)
(f [* ~[1 2 3] ~])

Note that we retain the terminating ~ since the expected sample is a list.

Named Arguments

We can incorporate optional arguments, although without custom default values (i.e., the default value is always the type-appropriate bunt, such as ~).

|=  [* ~ [val=@ud ~]]
(add val 2)

To use it (saved as gen/gate.hoon and synced with |commit):

+gate, =val 4

Since the default value is ~, if you are testing for the presence of named arguments you should test against that value.

Note that, in all of these cases, you are writing a gate |= bartis which accepts [* * ~] or the like as sample. Dojo (and Arvo generally) recognizes that %say generators have a special format and parses the command-line form into the appropriate form for the gate itself.

Reading: Tlon Corporation, “Generators”, sections “%say Generators”, “%say generators with arguments”, “Arguments without a cell”

Worked Examples

Rolling Dice

Write a %say generator which simulates scoring a simple dice throw of $n$ six-sided dice. That is, it should take $n$ as input and return the sum of $n$ dice. If no number is specified, then only one die roll should be returned.

Since Hoon is functional but random number generators stateful, you should use the =^ tisket rune to replace the current value in the RNG. =^ tisket is a kind of “one-effect monad,” which allows you to change a single part of the subject.

For instance, here is a %say generator that returns a list of n random integers between 1 and 100 (which can be read as percentages).
:-  %say
|=  [[* eny=@uv *] [n=@ud ~] ~]
  :-  %noun
  =/  values  `(list @ud)`~
  =/  count  0
  =/  rng  ~(. og eny)
  |-  ^-  (list @ud)
    ?:  =(count n)  values
    =^  r  rng  (rads:rng 100)
  $(count +(count), values (weld values ~[(add r 1)]))
In your dice generator n is optional, so it should be a named argument. Your command to run it in the Dojo should then look like this:
+dicethrow, =n 5
(Note the comma separating the generator name from the optional arguments.)
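
The state-threading idea behind =^ tisket can be sketched in Python as a pure function that returns both a value and the next generator state. Everything here is hypothetical: the linear congruential generator and its constants stand in for the RNG only for illustration and are not how the og core works.

```python
from typing import List, Tuple

# Illustrative constants for a simple linear congruential generator (an assumption).
MODULUS = 2 ** 32
MULTIPLIER = 1664525
INCREMENT = 1013904223

def rads(state: int, upper: int) -> Tuple[int, int]:
    """Return (value in [0, upper), new state); the old state is left untouched."""
    new_state = (MULTIPLIER * state + INCREMENT) % MODULUS
    return new_state % upper, new_state

def throw(seed: int, n: int) -> List[int]:
    """Produce n values in 1..100, threading the generator state through each step."""
    values = []
    rng = seed
    for _ in range(n):
        r, rng = rads(rng, 100)  # rebind both the result and the state, as =^ does
        values.append(r + 1)
    return values

print(throw(42, 5))
```

Because the state is passed in and handed back explicitly, the same seed always yields the same sequence, which is why a fresh eny matters.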

Prime Sieve

The Sieve of Eratosthenes is a classic (if relatively inefficient) way to produce a list of prime numbers. Save this as a file gen/primes.hoon, sync it, and run it as +primes 100. (Be careful not to use too large a number—use Ctrl+C to interrupt evaluation!)

:-  %say
|=  [[* eny=@uv *] [n=@ud ~] ~]
:-  %noun
=<
(siev n)
|%
::
:: Decompose into prime factors in ascending order.
::
++  prime-factors
  |=  n=@ud
  %-  sorter
  ?:  =(n 1)  ~[n 1]
  =+  [i=0 m=n primes=(primes-to-n n) factors=*(list @ud)]
  |-  ^+  factors
  ?:  =(i (lent primes))
    [factors]
  ?:  =(0 (mod m (snag i primes)))
    $(factors [`@ud`(snag i primes) factors], m (div m (snag i primes)), i i)
  $(factors factors, m m, i +(i))
::
:: Find prime factors in ascending order.
::
++  primes-to-n
  |=  n=@ud
  %-  dezero
  ?:  =(n 1)  ~[~]
  ?:  =(n 2)  ~[2]
  ?:  =(n 3)  ~[3]
  =+  [i=0 cands=(siev (div n 2)) factors=*(list @ud)]
  |-  ^+  factors
  ?:  =(i (lent cands))
    ?:  =(0 (lent (dezero factors)))
      ~[n]
    factors
  $(factors [`@ud`(filter cands n i) factors], i +(i))
::
:: Strip off matching modulo-zero components, (mod n factor)
::
++  filter
  |*  [cands=(list) n=@ud i=@ud]
  ?:  =((mod n (snag i `(list @ud)`cands)) 0)
    [(snag i `(list @ud)`cands)]
  ~
::  Find primes by the sieve of Eratosthenes
++  siev
  |=  n=@ud
  %-  dezero
  =+  [i=2 end=n primes=(gulf 2 n)]
  |-  ^+  primes
  ?:  (gth i n)
    [primes]
  $(primes [(clear (sub i 2) i primes)], i +(i))
:: wrapper to remove zeroes after sorting
++  dezero
  |=  seq=(list @)
  =+  [ser=(sort seq lth)]
  `(list @)`(skim `(list @)`ser pos)
++  pos
  |=  a=@
  (gth a 0)
:: wrapper sort---does NOT remove duplicates
++  sorter
  |=  seq=(list @)
  (sort seq lth)
:: replace element of c at index a with item b
++  nick
  |*  [[a=@ b=*] c=(list @)]
  (weld (scag a c) [b (slag +(a) c)])
:: zero out elements of c starting at a modulo b (but omitting a)
++  clear
  |*  [a=@ud b=@ud c=(list)]
  =+  [j=(add a b) jay=(lent c)]
  |-  ^+  c
  ?:  (gth j jay)
    [c]
  $(c [(nick [j 0] c)], j (add j b))
--
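
If the Hoon is hard to follow at first, the same zero-out-the-multiples strategy can be sketched in Python. The name `siev` is reused from the arm above only for readability; this is an illustrative model, not a translation of every arm in the core.

```python
from typing import List

def siev(n: int) -> List[int]:
    """Primes up to and including n, found by zeroing out multiples."""
    if n < 2:
        return []
    cands = list(range(2, n + 1))  # like (gulf 2 n); the value v sits at index v - 2
    for i in range(2, n + 1):
        # Zero every multiple of i beyond i itself, as ++clear does.
        for j in range(2 * i, n + 1, i):
            cands[j - 2] = 0
    # Drop the zeroes and sort, as ++dezero does.
    return sorted(c for c in cands if c > 0)

print(siev(30))
```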

Documentation Examples

The following Hoon Workbook examples walk you line-by-line through several %say generators of increasing complexity.

The traffic light example is furthermore an excellent prelude to our entrée to Gall.

Reading: Tlon Corporation, “Hoon Workbook: Digits”

Reading: Tlon Corporation, “Hoon Workbook: Magic 8-Ball”

Reading: Tlon Corporation, “Hoon Workbook: Traffic Light”

%ask Generators (Optional)

%ask generators assume some interactivity with the user. They are less commonly encountered in Arvo since at this point many developers prefer to write whole %gall apps.

For instance, here is the generator to retrieve your +code for web login. (At this point, focus on the structure not the content of this generator.)
::  Helm: query or reset login code for web
::
::::  /hoon/code/hood/gen
  ::
/?    310
/-  *sole
/+  *generators
:-  %ask
|=  $:  [now=@da eny=@uvJ bec=beak]
        [arg=?(~ [%reset ~]) ~]
    ==
=*  our  p.bec
^-  (sole-result [%helm-code ?(~ %reset)])
?~  arg
  =/  code=tape
    %+  slag  1
    %+  scow  %p
    .^(@p %j /(scot %p our)/code/(scot %da now)/(scot %p our))
  =/  step=tape
    %+  scow  %ud
    .^(@ud %j /(scot %p our)/step/(scot %da now)/(scot %p our))
  ::
  %+  print  'use |code %reset to invalidate this and generate a new code'
  %+  print  leaf+(weld "current step=" step)
  %+  print  leaf+code
  (produce [%helm-code ~])
::
?>  =(%reset -.arg)
%+  print  'continue?'
%+  print  'warning: resetting your code closes all web sessions'
%+  prompt
  [%& %project "y/n: "]
%+  parse
  ;~  pose
    (cold %.y (mask "yY"))
    (cold %.n (mask "nN"))
  ==
|=  reset=?
?.  reset
  no-product
(produce [%helm-code %reset])
Look for the following:

/? Kelvin version pin

?~ null check

?> assertion

.^ vane call to %j = %jael

%helm is the CLI app. %sole is a CLI library.

Key Points

Generators are Dojo’s way of importing simple standalone programs.

The subject consists of the context and arguments to a program.

Hoon Basics: Libraries

Overview

Teaching: 20 min
Exercises: 10 min

Questions

How can I import code into a Hoon program?

Objectives

Access built-in libraries.

Create a library in lib/ and utilize it using /+.

Access a library in the Dojo with -build-file.

Compose Hoon that adheres to the Tlon Hoon style guide.

Accessing Built-In Library Cores

Hoon provides a wrapped subject: Arvo wraps %zuse wraps hoon.hoon. These subject components are immediately available to you when you run any program. In addition to these, some contexts expose the [our now eny] pattern of ship-bound knowledge: the current ship identity our, the system time now, and a source of entropy eny. (This, incidentally, is why the boot process is necessary for each instance including fakezods.)

If you need more functionality than these, you can import a library using the Ford / fas runes; specifically,

/- foo, *bar, baz=qux imports a file from the sur directory (* pinned with no face, = with specified face)
/+ foo, *bar, baz=qux imports a file from the lib directory (* pinned with no face, = with specified face)

You can also directly build a file in Dojo with the thread -build-file; this is frequently more convenient for interactive testing.

=foo -build-file %/lib/foo/hoon

At this point, most operations are first-class features of the Hoon subject. For instance, to render a string with URL-compatible codes:

=mytape "Parallax_(Star_Trek:_Voyager)"
(weld "https://en.wikipedia.org/wiki/" (en-urlt:html mytape))

Useful Libraries

Check the contents of the /lib directory with the +ls generator:

> +ls %/lib
  agentio/hoon
  aqua-azimuth/hoon
  aqua-vane-thread/hoon
  azimuth/hoon
  ...

View the contents of a file with the +cat generator:

> +cat %/lib/ethereum/hoon

(No syntax highlighting with +cat, alas!)

You’ll see many of these employed in the Gall agents we compose tomorrow.

Libraries in Use

Scan through the generators in %/gen and see which libraries are used and how they are imported.

There are some notable omissions still which you can rectify for yourself:

ordered maps of sets (see the next lesson for motivation)
lazytrig.hoon provides transcendental functions (unjetted, Hoon-only).
string parsing is still a bit rough (DIY from fundamental components)

Hoon Style

Dogmatically good Hoon style has evolved over the years, and the current Arvo kernel is a palimpsest of styles.

Deducing Style

Examine several source code files. Enumerate some principles of good Hoon style that you infer from these.

Generally speaking, Hoon prefers short variable names with loose mnemonics, left-aligned rune branches, sparse commentary, and a preference for whichever irregular form is most expressive in context.

Hoon’s position on layout is: so long as your code is (a) correctly commented, (b) free from blank or overlong lines, (c) parses, and (d) looks good, it’s good layout.

We do need to distinguish two common forms that you’ve already encountered in vivo:

Wide form fits on a single line.
Tall form uses multiple lines.

Perfect is the enemy of good-enough, particularly when you are getting started. Focus on writing working code with clear intent before getting hung up on formatting.

Reading: Tlon Corporation, “Hoon Style Guide”

Key Points

Libraries are code which can be built and included in the subject of downstream code.

Hoon Basics: Advanced Data Structures

Overview

Teaching: 30 min
Exercises: 15 min

Questions

How are structured data typically represented, populated, and accessed in Hoon?

Which common patterns in Hoon exist?

Objectives

Identify common Hoon molds: units, sets, maps, jars, and jugs.

Explore the standard library’s functionality: text parsing & processing, functional hacks, randomness, hashing, time, and so forth.

Data Structures

Urbit employs a series of data structures (molds) which are either unique to the Urbit operating environment (such as a unit) or analogous to convenient forms in other languages (like a set or a map).

Units

Every atom in Hoon is an unsigned integer, even if interpreted by an aura. Auras do not carry prescriptive power, however, so they cannot be used to fundamentally distinguish a NULL-style or NaN-style non-result. That is, since ~ is itself the atom zero, if the result of a query is ~, how does one distinguish a value of zero from a non-result (missing value)?

Units mitigate the situation by acting as a type union of ~ (for no result) and a cell [~ u=item] containing the returned item (with face u).

++  unit
  |$  [item]
  $@(~ [~ u=item])

Many lookup operations return a unit precisely so you can differentiate a failed search from an answer of zero.
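
A loose Python analogy, with hypothetical names: `None` plays the role of ~, and a one-element tuple plays the role of [~ u=item], so a stored zero is never confused with a missing value.

```python
from typing import Dict, Optional, Tuple

# None stands in for ~; (value,) stands in for [~ u=value].
Unit = Optional[Tuple[int]]

def get(table: Dict[str, int], key: str) -> Unit:
    """Look up key, wrapping any hit so that 0 and 'missing' stay distinct."""
    if key in table:
        return (table[key],)
    return None

scores = {"zero": 0}
print(get(scores, "zero"))     # a wrapped zero
print(get(scores, "missing"))  # no result at all
```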

Sets

Many compound data structures in Hoon are trees, maps, or sets. Trees are the least important of these for our purposes today, so we will focus on +$map and +$set.

In computer science, a set is like a mathematical set with single-membership of values. A set is not ordered so the results may not be ordered or stored the same as the input.

> (sy ~[5 4 3 2 1])
[n=5 l={} r={1 2 3 4}]
> (sy ~[8 8 8 7 7 6 5 5 4 3 2 1])
[n=6 l={8 5 7} r={1 2 3 4}]

For the most part, we don’t worry too much about the particular structure chosen for internal representation of the set as a tree because we will use the ++in door for interacting with elements.

A door is a kind of core more general than a gate. In particular, rather than a default arm $ the caller must specify which arm of the door to pull. This produces a gate which must then be slammed on the values. The ++in core instruments +$set instances, which makes for a somewhat intuitive English-language reading of operations.

To produce a set, you can either construct it manually from a list using ++sy, or you can insert items using the ++put arm of in.

For each of the following, we assume the following set has been defined in Dojo:

=my-set (sy ~['a' 'B' "sea" 4 .5])

Put a value in the set:
```
  > (~(put in my-set) ~.6)
  [n=.7.6e-44 l=[n=.5 l={} r={}] r=[n=.9.2e-44 l={} r={[i='s' t="ea"] 'a' '\04'}]]
```
(Notice, in the time-honored tradition of functional languages, that my-set is unaltered and a copy of the values is returned.)

Query if value is in set:

  > (~(has in my-set) 'B')
  %.y
  > (~(has in my-set) 'b')
  %.n

Get size of set:
```
  > ~(wyt in my-set)
  5
```

Apply gate to values:

  :: strip auras from all elements in set
  > (~(run in my-set) |=(a=* a))
  {1.084.227.584 66 [115 101 97 0] 97 4}

Convert to list:

  > ~(tap in my-set)
  ~[.6e-45 .1.36e-43 [i='s' t="ea"] .9.2e-44 .5]

There are also a number of set-theoretical operations such as difference (dif), intersection (int), union (uni), etc.
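
For comparison, the same operations on an immutable Python `frozenset` behave similarly: each one returns a new set and leaves the original alone. The variable names are illustrative only.

```python
my_set = frozenset({1, 2, 3, 4, 5})
other = frozenset({4, 5, 6, 7})

bigger = my_set | {6}          # like put: produces a new set
print(6 in my_set, 6 in bigger)

difference = my_set - other    # like dif
intersection = my_set & other  # like int
union = my_set | other         # like uni
print(sorted(difference), sorted(intersection), sorted(union))
```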

Maps

A +$map is a set of key–value pairs: (tree (pair key value)). (Other languages know this as a dictionary or associative array.) The ++by core provides map services.

To produce a map, you can either construct it manually from cells using ++my, or you can insert items using the ++put or ++gas arms of by. The latter is more robust and we use it here.

Map keys are commonly @tas words.

For each of the following, we assume the following map has been defined in Dojo:

=greek (~(gas by *(map @tas @t)) ~[[%alpha 'α'] [%beta 'β'] [%gamma 'γ'] [%delta 'δ'] [%epsilon 'ε'] [%zeta 'ζ'] [%eta 'η'] [%theta 'θ'] [%iota 'ι'] [%kappa 'κ'] [%lambda 'λ'] [%mu 'μ'] [%nu 'ν'] [%xi 'ξ'] [%omicron 'ο'] [%pi 'π'] [%rho 'ρ'] [%sigma 'σ'] [%tau 'τ'] [%upsilon 'υ'] [%phi 'φ'] [%chi 'χ'] [%psi 'ψ'] [%omega 'ω']])

Put a value in the map:
```
  (~(put by greek) %digamma 'ϝ')
```
(Notice, in the time-honored tradition of functional languages, that greek is unaltered and a copy of the values is returned.)

Query if key is in map:

  > (~(has by greek) %alpha)
  %.y
  > (~(has by greek) %betta)
  %.n

Get size of map:
```
  > ~(wyt by greek)
  24
```
Get value by key:
```
  > (~(get by greek) %delta)
  [~ 'δ']
```
(What mold is this return value? Why?)
Apply gate to values:
```
  > (~(run by greek) |=(a=@t `@ud`a))
```
Get set of keys:
```
  ~(key by greek)
```
Get list of values:
```
  ~(val by greek)
```
(Note that these are not in the same order.)
Convert to list of pairs:
```
  ~(tap by greek)
```

Many aspects of Hoon, in particular the parser, have their own characteristic data structures, but this should serve to get you started reading and retrieving data from structured sources.

Reading: Hoon School, “Trees, Sets, and Maps”

Jars and Jugs (Optional)

A couple of compound structures are also used frequently that have their own terminology:
jar is a map of lists.
  ++  jar  |$  [key value]  (map key (list value))
jug is a map of sets.
  ++  jug  |$  [key value]  (map key (set value))
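
In Python terms, these correspond roughly to a dictionary of lists and a dictionary of sets. The sketch below uses made-up keys and values to show the difference in how repeated values are treated.

```python
from collections import defaultdict

jar = defaultdict(list)  # map of lists: keeps order and duplicates
jug = defaultdict(set)   # map of sets: each value is stored once per key

pairs = [("fruit", "apple"), ("fruit", "apple"), ("fruit", "pear"), ("veg", "kale")]
for key, value in pairs:
    jar[key].append(value)
    jug[key].add(value)

print(jar["fruit"])
print(sorted(jug["fruit"]))
```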

Standard Library

The Hoon standard library is split between sys/hoon.hoon and sys/zuse.hoon. Since Hoon follows the Lisp dictum that all code is data, the standard library functions can be quite helpful in operating on all sorts of inputs. We will focus on a few particular aspects: text parsing & processing, randomness, hashing, and time.

Text Parsing and Processing

Most of the text parsing library is built to process Hoon itself, and a variety of rule parsers and analyzers have been built up to this end. However, we are interested in the much simpler case of loading a structured text string and converting it to some sort of internal Hoon representation.

Urbit Types from @t

Frequently, one receives plaintext data from some source (like an HTTP PUT request) and needs to convert it to a Hoon-compatible atom format. Alternatively, one needs to convert from a raw Hoon atom into a text representation. ++slaw and ++scot provide this functionality:

> (slaw %p '~dopzod')
[~ 4.608]
> (scot %p 1.000)
~.~wanlyn

++slaw converts a given @t cord by a designated aura into a unit of that aura.
++scot converts a given atom into a @ta representation of that value, rendered according to a given aura. (See also ++scow to convert to a tape.)

JSON Structured Data

JSON is a frequently-employed data representation standard that organizes data into series of lists and key-value pairs (or maps).

{
  "name": "John",
  "age": 30,
  "car": null
}

Hoon has built-in support for parsing and exporting JSON-styled data. THIS IS WHERE THINGS GET WEIRD. Frequently when working in Hoon, it becomes necessary for you the developer to think in terms of structures rather than representations. JSON is frequently thought of as a generic text representation of data, and while it is acknowledged that there are some different but compatible representations possible (such as by manipulating whitespace), in general JSON is the text. Not so for Hoon. It’s a very Lisp-y move to consider the data as more fundamental than the instantiation as one particular type of text, and thus there being multiple valid representations of the same data, some quite different from each other.

The foregoing example becomes in Hoon:

[~
  [%o p={
      [p='car'
       q=~
      ]
      [p='name'
       q=[%s p='John']
      ]
      [p='age'
       q=[%n p=~.30]
      ]
    }
  ]
]

Note the type tags on the information components, and that the map is unordered.

Hoon parses JSON in two passes: first, the JSON is converted in the raw to a tagged data structure using ++de-json:html. After this, schema-specific parsers are used to deconstruct the tagged data into usable components with ++dejs:format. This gets a little complicated, and we’ll revisit it again tomorrow when we need it.

The first part is relatively straightforward. Given a cord (@t) of JSON data, parse it to a tagged json data structure:

> =a '{"name":"John", "age":30, "car":null}'
> a
'{"name":"John", "age":30, "car":null}'
> (de-json:html a)
[~ [%o p={[p='car' q=~] [p='name' q=[%s p='John']] [p='age' q=[%n p=~.30]]}]]

JSON is a structured data type, however, so one cannot generically process it into particular values. One must know and account for the expected structure.

We have to build a parser to extract expected values (and ignore others).

> =parser %-  ot:dejs-soft:format
  :~  [%name so:dejs-soft:format]
      [%age no:dejs-soft:format]
      [%car (mu so):dejs-soft:format]  :: allows for unit result incl. null
  ==
> (parser u:+:(de-json:html a))

The dejs-soft library fails gracefully (as opposed to dejs). It can parse, among others, the following types. (Wrap with ++mu if the quantity is optional.)

da: UTC date
ne: number as real
ni: number as integer
so: string as cord
ul: null

JSON parsing code often ends up rather involved.
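
The two-pass shape (generic parse first, schema-specific extraction second) can be sketched in Python for comparison. The function name and the choice to return `None` on any mismatch are assumptions made to echo the "soft" behavior described above; this is not a model of the Hoon library itself.

```python
import json
from typing import Any, Dict, Optional

def parse_person(text: str) -> Optional[Dict[str, Any]]:
    """Return the name/age/car fields, or None if the text does not fit the schema."""
    # Pass 1: text to a generic tagged structure.
    try:
        raw = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(raw, dict):
        return None
    # Pass 2: pull out the expected fields, checking each one; extra fields are ignored.
    name, age, car = raw.get("name"), raw.get("age"), raw.get("car")
    if not isinstance(name, str):
        return None
    if isinstance(age, bool) or not isinstance(age, (int, float)):
        return None
    if car is not None and not isinstance(car, str):
        return None  # car may be null, mirroring the optional field above
    return {"name": name, "age": age, "car": car}

print(parse_person('{"name":"John", "age":30, "car":null}'))
print(parse_person('{"name":"John"}'))
```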

Randomness

We encountered random numbers previously when dealing with the six-sided dice problem.

The entropy eny is generated from /dev/random, itself seeded by factors such as keyboard typing latency.

For instance, to grab a particular entry at random from a list, you can convert eny to a valid key and then retrieve:

(snag (~(rad og eny) 5) (gulf 1 5))
(snag (~(rad og eny) 5) `(list @p)`~[~wanzod ~marzod ~binzod ~litzod ~samzod])

The og core provides a number of methods:

++rad for random in range
++raw for random binary bits

The main thing to keep in mind when using the RNG is that you must update the internal state. Since Hoon is a functional programming language without side effects, you have to pin the modified state under the old state’s name to continue to use it.

:-  %say
|=  [[* eny=@uv *] [n=@ud ~] ~]
  :-  %noun
  =/  values  `(list @ud)`~
  =/  count  0
  =/  rng  ~(. og eny)
  |-  ^-  (list @ud)
    ?:  =(count n)  values
    =^  r  rng  (rads:rng 100)
  $(count +(count), values (weld values ~[(add r 1)]))

Time

Hoon supports native date-time auras: @da (absolute, relative to “the beginning of time”, ~292277024401-.1.1 = January 1, 292,277,024,401 B.C.) and @dr (relative). The @da form is called the atomic date in the docs.

There is also a parsed time format tarp

> *tarp
[d=0 h=0 m=0 s=0 f=~]

and a parsed date format date

> *date
[[a=%.y y=0] m=0 t=[d=0 h=0 m=0 s=0 f=~]]

which are convenient for representations and interconversions.

Numerical Conversions

Since every value is at heart an unsigned integer, there is a well-formed binary representation for each value. This determines the way that the system treats text characters and floating-point values, for instance, and if you are not careful you can quickly run into absurdity:

> (add 1 .-1)
3.212.836.865

If we specify the aura, we get a glimpse into what is happening:

> `@rs`(add 1 .-1)
.-1.0000001

That is, there is a binary representation of .-1 to which the value 1 has been added:

> `@ub`.-1
0b1011.1111.1000.0000.0000.0000.0000.0000
> `@ub`(add .-1 1)
0b1011.1111.1000.0000.0000.0000.0000.0001

Things can get much worse, too!

> (add .1 .1)
2.130.706.432
> `@rs`(add .1 .1)
.1.7014118e38

Auras have a characteristic core that enables one to consistently work with their values. For instance, for floating-point mathematics, one should use the rs core:

> (add:rs .1 .1)
.2

To convert between numbers, don’t do a simple aura cast, which won’t have the desired effect most of the time:

> `@rs`1.000
.1.401e-42
> `@rs`0
.0

Instead, employ the correct conversion routine:

> (sun:rs 5)
.5
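
The same bit-level effects can be reproduced in Python with the standard `struct` module, which may help if the results above look like magic. The helper names are hypothetical; the sketch only reinterprets 32-bit patterns and does not model the rs core.

```python
import struct

def rs_to_atom(value: float) -> int:
    """Bit pattern of a single-precision float, read as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def atom_to_rs(atom: int) -> float:
    """Reinterpret an unsigned 32-bit integer as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", atom))[0]

print(rs_to_atom(-1.0) + 1)                         # integer addition, like (add 1 .-1)
print(bin(rs_to_atom(-1.0)))                        # the underlying bits of .-1
print(rs_to_atom(1.0) + rs_to_atom(1.0))            # like (add .1 .1)
print(atom_to_rs(rs_to_atom(1.0) + rs_to_atom(1.0)))  # those bits read back as a float
print(atom_to_rs(1000))                             # a bare cast of 1.000 to @rs
```

Integer addition on the raw bits lands in the exponent field, which is why the sum of two ones reads back as an enormous float rather than two.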

Wrapping Up the First Day

At this point, you’ve been drinking from the firehose and it may have hurt a little.

Our objective for today was to start you thinking in terms of the data structures of idiomatic Hoon. I want you to absorb as much of this as you can tonight and then tomorrow we will venture into the operating function, userspace, and software deployment, all of which you should be equipped to grapple with now.

If you are interested in exploring some single-purpose generators, I suggest these:

Matt Newport, planetppm.hoon, a raytracer written in Hoon.
N E Davis, julia, a fractal generator written in Hoon.

Key Points

Doors generalize the notion of a gate to a gate-factory.

Units, sets, maps, jars, and jugs provide a principled way to access and operate on data.

Arvo: Kernel Structure

Overview

Teaching: 20 min
Exercises: 0 min

Questions

How is the Urbit kernel structured?

How do I access system operations?

Objectives

Diagram the high-level architecture of Urbit.

Urbit bills itself as a “clean-slate operating system” or “operating function”. Now that you have some of the syntax of Hoon under your belt, we can examine what the kernel does to implement these features. In brief, these are the selling points I make to any developer who asks, “Why bother with Urbit?”

Urbit solves the hard problems of implementing a peer-to-peer network (including identity, NAT traversal, and exactly-once delivery) in the kernel so app developers can focus on business logic.

The entire OS is a single pure function that provides application developers with strong guarantees: automated persistence and memory management, repeatable builds, and support for hot code reloading.

My partial summary is that Urbit offers the developer:

Cryptographic identity.
Version-controlled typed filesystem.
Authentication primitives.
Persistent database.

That is, once you’ve done the admittedly hard shovel work of learning Urbit and building on it, you get a number of nice guarantees for “free,” or at least able to be taken for granted.

We could do worse than to start our exploration of the Urbit kernel by quoting the Whitepaper itself on the subject of Arvo:

The fundamental unit of the Urbit kernel is an event called a +$move. Arvo is primarily an event dispatcher between moves created by vanes. Each move therefore has an address and other structured information.

Arvo is the main lifecycle function which handles these discrete events. What initiates and handles events? These are the vanes, standardized system services. Every vane defines its own structured events (+$moves). Each unique kind of structured event has a unique, frequently whimsical, name. This can make it challenging to get used to how a particular vane behaves.

We focus now on the center of this diagram. The circle in the middle represents the Nock VM function. Wrapping around that, the Arvo subject consists of the Arvo lifecycle function, the Hoon language, and the Zuse/Lull data structures and conventions.

Arvo is essentially an event handler which can coordinate and dispatch messages between vanes as well as emit Unix %unix events (side effects) to the underlying (presumed Unix-compatible) host OS. Arvo as hosted OS does not carry out any tasks specific to the machine hardware, such as memory allocation, system thread management, and hardware- or firmware-level operations. These are left to the king and serf, the daemon runtime processes which together run Arvo.

Arvo is architected as a state machine, the deterministic end result of the event log. We need to briefly examine Arvo from two separate angles:

Event processing engine and state machine (vane coordinator).
Standard noun structure (“Arvo-shaped noun”).
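
The idea of a state that is "the deterministic end result of the event log" can be sketched in a few lines. The Python below is a toy model, not Arvo: the event shapes (`"set"`, `"del"`) and the function names `transition` and `replay` are invented for illustration. What it shows is that when the transition function is pure, replaying the same log always reproduces the same state.

```python
from functools import reduce


def transition(state: dict, event: tuple) -> dict:
    """Pure function: build a new state from the old state and one event."""
    kind, key, value = event
    new_state = dict(state)  # copy, so the old state is never mutated
    if kind == "set":
        new_state[key] = value
    elif kind == "del":
        new_state.pop(key, None)
    return new_state


def replay(event_log: list) -> dict:
    """Fold the transition function over the log, starting from an empty state."""
    return reduce(transition, event_log, {})


log = [("set", "a", 1), ("set", "b", 2), ("del", "a", None)]
print(replay(log))  # {'b': 2}
```

Because nothing outside the log influences the result, a second replay of the same log yields an identical state.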

Arvo as Event Processing Engine

A vanilla event loop scales poorly in complexity. A system event is the trigger for a cascade of internal events; each event can schedule any number of future events. This easily degenerates into “event spaghetti.” Arvo has “structured events”; it imposes a stack discipline on event causality, much like imposing subroutine structure on gotos.

Arvo events (+$moves) contain metadata and data. Arvo recognizes three types of moves:

%pass events are forward calls from one vane to another (or back to itself, occasionally).
%give events are returned values, which move back along the calling duct.
%unix events communicate from Arvo to the underlying binary in such a way as to emit an external effect (an %ames network communication, for instance, or text input and output).

A bit more terminology:

A move consists of message data and metadata indicating what needs to happen. A move sends an action to a location along a call stack (or duct).
A card is an event, or action. Cards can have arbitrarily complicated syntax depending on the vane and message. For instance, here is an example of a Gall card:
```
  [%give %fact ~[/status] [%atom !>(status.state)]]
```
There is a complicated interplay of cards as seen from callers and callees, but we will largely ignore that level of detail here.
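
The stack discipline behind %pass and %give can be pictured as pushing and popping return addresses. The following Python is a deliberately simplified sketch (the function names and the list-of-strings duct are assumptions of this model, not Arvo's actual data structures): a %pass pushes a wire onto the duct, and a %give pops it to find the way back.

```python
def pass_move(duct: list, wire: str) -> list:
    """Model of %pass: push a return address (wire) onto the front of the duct."""
    return [wire] + duct


def give_move(duct: list) -> tuple:
    """Model of %give: pop the most recent wire and return along the rest."""
    return duct[0], duct[1:]


duct = ["//term/1"]                        # where the original event came from
duct = pass_move(duct, "/dill")            # first call forward
duct = pass_move(duct, "/gall/use/hood")   # nested call
print(duct)        # ['/gall/use/hood', '/dill', '//term/1']

wire, duct = give_move(duct)               # the response unwinds one level
print(wire, duct)  # /gall/use/hood ['/dill', '//term/1']
```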

Card Types (Optional)

Each vane defines a protocol for interacting with other vanes (via Arvo) by defining four types of cards: tasks, gifts, notes, and signs.

In other words, there are only four ways of seeing a move:

as a request seen by the caller, which is a note.

that same request as seen by the callee, a task.

the response to that first request as seen by the callee, a gift.

the response to the first request as seen by the caller, a sign.

To see how an event is processed in the Dojo, type |verb then +ls %. After pressing Enter, you should see something like the following:

["" %unix %belt /d/term/1 ~2021.10.8..21.29.25..aa8c]
["|" %pass [%dill %g] [[%deal [~per ~per] %hood %poke] /] ~[//term/1]]
["||" %give %gall [%unto %poke-ack] i=/dill t=~[//term/1]]
["||" %give %gall [%unto %fact] i=/dill t=~[//term/1]]
["|||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["|||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["||" %pass [%gall %g] [ [%deal [~per ~per] %dojo %poke]   /use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo ] ~[/dill //term/1]]
["|||" %give %gall [%unto %poke-ack] i=/gall/use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo t=~[/dill //term/1]]
["" %unix %belt /d/term/1 ~2021.10.8..21.29.26..2862]
["|" %pass [%dill %g] [[%deal [~per ~per] %hood %poke] /] ~[//term/1]]
["||" %give %gall [%unto %poke-ack] i=/dill t=~[//term/1]]
["||" %give %gall [%unto %fact] i=/dill t=~[//term/1]]
["|||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["|||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["||" %pass [%gall %g] [ [%deal [~per ~per] %dojo %poke]   /use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo ] ~[/dill //term/1]]
["|||" %give %gall [%unto %poke-ack] i=/gall/use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo t=~[/dill //term/1]]
["" %unix %belt /d/term/1 ~2021.10.8..21.29.26..f451]
["|" %pass [%dill %g] [[%deal [~per ~per] %hood %poke] /] ~[//term/1]]
["||" %give %gall [%unto %poke-ack] i=/dill t=~[//term/1]]
["||" %give %gall [%unto %fact] i=/dill t=~[//term/1]]
["|||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["|||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["||" %pass [%gall %g] [ [%deal [~per ~per] %dojo %poke]   /use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo ] ~[/dill //term/1]]
["|||" %give %gall [%unto %poke-ack] i=/gall/use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo t=~[/dill //term/1]]
["" %unix %belt /d/term/1 ~2021.10.8..21.29.28..448d]
["|" %pass [%dill %g] [[%deal [~per ~per] %hood %poke] /] ~[//term/1]]
["||" %give %gall [%unto %poke-ack] i=/dill t=~[//term/1]]
[ "||" %pass [%gall %g] [ [%deal [~per ~per] %dojo %poke]   /use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo ] ~[/dill //term/1]]
[ "|||" %give %gall [%unto %poke-ack] i=/gall/use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo t=~[/dill //term/1]]
[ "|||" %give %gall [%unto %fact] i=/gall/use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo t=~[/dill //term/1]]
["||||" %give %gall [%unto %fact] i=/dill t=~[//term/1]]
["|||||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["|||||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["|||||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["|||" %pass [%gall %c] [%warp /use/dojo/0wHkGtE/~per/drum_~per/hand/gen/ls] ~[/dill //term/1]]
["||||" %give %clay %writ i=/gall/use/dojo/0wHkGtE/~per/drum_~per/hand/gen/ls t=~[/dill //term/1]]
["|||||" %give %gall [%unto %fact] i=/gall/use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo t=~[/dill //term/1]]
["||||||" %give %gall [%unto %fact] i=/dill t=~[//term/1]]
["|||||||" %give %dill %blit i=/gall/use/herm/0wHkGtE/~per/view/ t=~[/dill //term/1]]
["|||||" %give %gall [%unto %fact] i=/gall/use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo t=~[/dill //term/1]]
["|||||" %give %gall [%unto %fact] i=/gall/use/hood/0wHkGtE/out/~per/dojo/drum/phat/~per/dojo t=~[/dill //term/1]]
> +ls %
app/ gen/ lib/ mar/ sur/ sys/ ted/ tests/

This involves Dill, Gall, and Clay.

There is an excellent “move trace” tutorial on Urbit.org which covers this in detail. We don’t need to deeply understand this terminology to understand that events are generated by vanes, dispatched by Arvo, and resolved by vanes.

An interrupted event never happened. The computer is deterministic; an event is a transaction; the event log is a log of successful transactions. In a sense, replaying this log is not Turing complete. The log is an existence proof that every event within it terminates.

Tlon Corporation, “Move Trace”

Arvo as Standard Noun Structure

Arvo defines five standard arms for vanes and the binary runtime to use:

++peek grants read-only access to a vane; this is called a scry.
++poke accepts +$moves and processes them; this is the only arm that actually alters Arvo’s state.
++wish accepts a core and parses it against %zuse, which is instrumentation for runtime access.
++come and ++load are used in kernel upgrades, allowing Arvo to update itself in-place.

Each arm possesses this same structure, which means that as the Urbit OS kernel grows and changes the main event dispatcher can remain the same. For instance, when the build vane %ford was incorporated into %clay, no brain surgery was needed on Arvo to make this possible and legible. Only the affected vanes (and any calls to %ford) needed to change.

Tlon Corporation, “Arvo Tutorial”

Key Points

Arvo is an event processor or state machine.

Events are well-defined single operations on the system state.

The binary executable is separated into king and serf, the system state, an event log and runtime support.

Arvo: Vane Roles

Overview

Teaching: 20 min
Exercises: 0 min

Questions

What services do Arvo vanes provide?

How do I access system operations?

Objectives

Understand architecture of the %ames network and communications.

Access file data from %clay.

Access arbitrary revisions of a file through %clay.

Manipulate remote data through %eyre/%iris.

Provide external commands to Arvo through %khan.

Arvo Vanes

Each vane has a characteristic structure which identifies it as a vane to Arvo and allows it to handle moves consistently. Most operations are one of three things:

A scry or request for data (++peek or ++scry or ++on-peek).
An update (++poke).
An evaluation (of a core) (++wish).

In order to orient yourself around the kinds of things the Urbit OS does, it is worth a brief tour of the Arvo vanes. Each vane offers a particular system service. These are sometimes represented schematically as the kernel layer around the Arvo state machine:

`%ames`, A Network

In a sense, %ames is the operative definition of an Urbit ship on the network. That is, from outside of one’s own urbit, the only specification that must be hewed to is that %ames behaves a certain way in response to events. (Of course, without a fully operational ship behind the %ames receiver not much would happen.)

%ames implements a system expecting—and delivering—guaranteed one-time delivery. This derives from an observation in the Whitepaper:

There is a categorical difference between a bus, which transports commands, and a network, which transports packets. You can drop a packet but not a command; a packet is a fact and a command is an order. To send commands over a wire is unforgivable: you’ve turned your network into a bus. Buses are great, but networks are magic.

Facts are inherently idempotent; learning fact X twice is the same as learning it once. You can drop a packet, because you can ignore a fact. Orders are inherently sequential; if you get two commands to do thing X, you do thing X twice.

%ames communicates using the UDP protocol. Data requests into %ames typically locate ship and peer information, protocol version, ship state, etc. Peer discovery is handled via stars and galaxies, followed by peer-to-peer routing after discovery (barring breach).

The runtime tells Ames which physical address a packet came from, represented as an opaque atom. Ames can emit a packet effect to one of those opaque atoms or to the Urbit address of a galaxy (root node), which the runtime is responsible for translating to a physical address. One runtime implementation sends UDP packets using IPv4 addresses for ships and DNS lookups for galaxies, but other implementations may overlay over other kinds of networks.

Tlon Corporation, “Ames Overview”

`%behn`, A Timer

%behn is a simple vane that promises to emit events after—but never before—their timestamp. This is used as a wake-up timer for many deferred events. %behn maintains an event handler and a state.

Because %behn is the shortest vane, we commend it to the student as an excellent subject for a first dive into the structure of a vane.

%behn scries retrieve timers, timestamps, next timer to fire, etc.
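
The promise "after, but never before" is easy to model with a priority queue. This Python class is a toy illustration and not %behn's implementation. The class name `TimerVane` is hypothetical, and integer times stand in for @da timestamps.

```python
import heapq


class TimerVane:
    """Toy wake-up timer: timers fire at or after their due time, never before."""

    def __init__(self):
        self._timers = []  # min-heap of (due_time, label)

    def wait(self, due_time, label):
        """Schedule a timer to fire once due_time has been reached."""
        heapq.heappush(self._timers, (due_time, label))

    def next_timer(self):
        """Return the due time of the next timer to fire, or None if idle."""
        return self._timers[0][0] if self._timers else None

    def wake(self, now):
        """Fire, in due-time order, every timer whose due time has been reached."""
        fired = []
        while self._timers and self._timers[0][0] <= now:
            fired.append(heapq.heappop(self._timers)[1])
        return fired


behn = TimerVane()
behn.wait(10, "save-draft")
behn.wait(5, "ping")
print(behn.wake(4))       # [] since nothing is due yet
print(behn.next_timer())  # 5
print(behn.wake(12))      # ['ping', 'save-draft']
```

A late wake-up (time 12 for a timer due at 5) is allowed in this model, while an early one never is.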

Tlon Corporation, “Behn Overview”

`%clay`, A File System

%clay is one of the most significant vanes in Arvo. %clay is a global-namespace typed version-control filesystem, meaning that

it can refer to any value on any ship (although it may not be able to access said value);
it has a type for each noun it holds; and
it retains a full history of the file system.

%clay primarily holds source code files: libraries, generators, threads, agents, marks, etc. The actual source code of %clay is possibly the thickest jungle: it contains some very old conventions and has a particularly byzantine naming system compared to later vanes.

The basic data model of %clay is that paths yield files which are basically typed data blobs. The type is a mark, which is a representation and conversion routine. Marks are stored in /mar, and have a system of conversion to other types, conversion from other types, etc.

Marks and conversions

A mark is a rule for a data structure. It’s sort of like a file extension for %clay. %clay also maintains rules for mapping that data structure (such as %json) to another (like %txt). Any %clay path includes the mark to use on that file—it’s not really a file extension, per se, it’s an interpretive rule! Marks live in mar/ and have a standard core structure.

For instance, the mark for %json lives at mar/json.hoon and reads as:

::
::::  /hoon/json/mar
  ::
/?    310
  ::
::::  compute
  ::
=,  eyre
=,  format
=,  html
|_  jon=json
::
++  grow                                                ::  convert to
  |%
  ++  mime  [/application/json (as-octs:mimes -:txt)]   ::  convert to %mime
  ++  txt   [(crip (en-json jon))]~
  --
++  grab
  |%                                                    ::  convert from
  ++  mime  |=([p=mite q=octs] (fall (rush (@t q.q) apex:de-json) *json))
  ++  noun  json                                        ::  clam from %noun
  ++  numb  numb:enjs
  ++  time  time:enjs
  --
++  grad  %mime
--

Marks are used to:

Convert between marks (tubes).
Diff, patch, and merge for %clay’s revision control operations.
Validate untyped nouns.

Each mark has three arms:

++grow converts from this mark to another mark (first attempt to convert).
++grab converts from another mark to this mark (second attempt to convert).
++grad is used to ++diff, ++pact, ++join, and ++mash the noun.
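
To make the lookup concrete, here is a toy Python model of mark conversion. It is not how %clay is implemented. The `MARKS` table, the `convert` function, and the simple CSV parser are all assumptions of this sketch. It mirrors the %json mark above, where ++grow holds arms that produce other marks (%mime, %txt) and ++grab holds arms that accept other marks, and it tries the source mark's grow entry before the target mark's grab entry.

```python
import json

# Each mark has "grow" (this mark -> other marks) and "grab" (other marks -> this mark).
MARKS = {
    "json": {
        "grow": {"txt": lambda value: json.dumps(value, sort_keys=True)},
        "grab": {},
    },
    "txt": {"grow": {}, "grab": {}},
    "csv": {
        "grow": {},
        "grab": {"txt": lambda text: [line.split(",") for line in text.splitlines()]},
    },
}


def convert(value, source, target):
    """Convert value from the source mark to the target mark."""
    grow = MARKS[source]["grow"].get(target)   # first attempt: source's grow
    if grow is not None:
        return grow(value)
    grab = MARKS[target]["grab"].get(source)   # second attempt: target's grab
    if grab is not None:
        return grab(value)
    raise LookupError(f"no conversion from {source} to {target}")


print(convert({"b": 1, "a": 2}, "json", "txt"))   # {"a": 2, "b": 1}
print(convert("60,110\n45,109", "txt", "csv"))    # [['60', '110'], ['45', '109']]
```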

Besides %clay, the %gall userspace vane uses marks to validate and manipulate the data values being carried by pokes and scries.

Paths are resolved within a given desk, which is like a Git branch. Until very recently this only mattered for OTA updates, in which your local %home desk received an update from your Azimuth sponsor’s %kids desk. Now, however, desks have become the primary mode of software distribution. We will revisit them as a practical matter later today.

Now that you have seen a little more of how vanes work, take a gander at the +ls and +cat generators. (These are the most complicated generators we’ll see.)

> +cat %/gen/ls/hoon
/~per/home/~2021.10.8..19.11.55..0a8a/gen/ls/hoon
::  LiSt directory subnodes
::
::::  /hoon/ls/gen
  ::
/?    310
/+    show-dir
::
::::
  ::
~&  %
:-  %say
|=  [^ [arg=path ~] vane=?(%g %c)]
=+  lon=.^(arch (cat 3 vane %y) arg)
tang+[?~(dir.lon leaf+"~" (show-dir vane arg dir.lon))]~

> +cat %/gen/cat/hoon
/~per/home/~2021.10.8..19.06.15..bc83/gen/cat/hoon
::  ConCATenate file listings
::
::::  /hoon/cat/gen
  ::
/?    310
/+    pretty-file, show-dir
::
::::
  ::
:-  %say
|=  [^ [arg=(list path)] vane=?(%g %c)]
=-  tang+(flop `tang`(zing -))
%+  turn  arg
|=  pax=path
^-  tang
=+  ark=.^(arch (cat 3 vane %y) pax)
?^  fil.ark
  ?:  =(%sched -:(flop pax))
    [>.^((map @da cord) (cat 3 vane %x) pax)<]~
  [leaf+(spud pax) (pretty-file .^(noun (cat 3 vane %x) pax))]
?-     dir.ark                                          ::  handle ambiguity
    ~
  [rose+[" " `~]^~[leaf+"~" (smyt pax)]]~
::
    [[@t ~] ~ ~]
  $(pax (welp pax /[p.n.dir.ark]))
::
    *
  =-  [palm+[": " ``~]^-]~
  :~  rose+[" " `~]^~[leaf+"*" (smyt pax)]
      `tank`(show-dir vane pax dir.ark)
  ==
==

`++ford`, A Build System

++ford builds code (either from the Dojo, from a library, from an app, etc.). ++ford used to be a standalone vane but was integrated into %clay in 2020.

++ford provides a number of runes for building and importing code into a subject:

/-: Import a structure file from sur.
/+: Import a library file from lib.
/=: Import a user-specified file.
/*: Import the contents of a file converted by given mark.

Importing with * removes the face (i.e. imports directly into the namespace), while foo=bar renames the face.

Build a Mark (Optional)

A CSV file (comma-separated values, or common separator of values) contains tabular information, with fields running across each line and records running down the file.
Duration,Pulse,Maxpulse,Calories
60,110,130,409.1
60,117,145,479.0
60,103,135,340.0
45,109,175,282.4
45,117,148,406.0
60,102,127,300.0
60,110,136,374.0
45,104,134,253.3
30,109,133,195.1
Compose a mark capable of conversion from a CSV file to a plain-text file (and vice versa).

The ++grad arm can be copied from the hoon mark, since we are not concerned with preserving CSV integrity.
Solution
=,  csv
|_  csv/@t
++  grow
  |%
  ++  txt  text
  --
++  grab
  |%
  ++  noun  @t
  ++  txt   parse
  --
++  grad  %txt
--

`%dill`, A Terminal Driver

%dill handles keypress events generated from the keyboard or telnet. This includes the state of the terminal window (size, shape, etc.) and keystroke-by-keystroke events. %dill scries are unusual, in that they are typically only necessary for fine-grained Arvo control of the display. Even command-line apps instrumented with %shoe do not commonly call into %dill. The only instance of use in the current Arvo kernel is in Herm, the terminal session manager.

Tlon Corporation, “Dill Overview”

`%eyre` and `%iris`, Server and Client Vanes

%eyre handles HTTP requests from clients. For instance, %eyre handles session cookies for the browser. %eyre is also the main interface for %gall agents to the outside world. Channels are defined as pipelines from external HTTP clients to %eyre as a thin layer over %gall agents, and back again.

%iris handles outbound HTTP requests to servers; that is, it acts as an HTTP client. For instance, %iris can fetch remote HTTP resources (HTTP GET command).

`%jael`, Secretkeeper

%jael keeps secrets, the cryptographic keys that make it possible to securely control your Urbit. Among other cryptographic facts, %jael keeps track of the following:

Subscription to %azimuth-tracker, the current state of the Azimuth PKI.
Initial public and private keys for the ship.
Public keys of all galaxies.
Record of Ethereum registration for Azimuth.

%jael weighs in as one of the shorter vanes, but is critical to Urbit as a secure network-first operating system. %jael is in fact the first vane loaded after %dill when bootstrapping Arvo on a new instance because it breaks symmetry and provides identity.

Key Points

Vanes communicate by means of events.

%ames provides network communications including peer-to-peer events.

%clay instruments a typed version-controlled filesystem.

%eyre and %iris together offer server and client operations.

%khan is the external control plane.

Arvo: API & Scrying

Overview

Teaching: 20 min
Exercises: 15 min

Questions

How do I access data locally?

How do I access data remotely?

Objectives

Update and access the file system.

Scry for local and remote data.

Use subscriptions to access data.

Enumerate functionality of the Urbit API.

Data tend to live in one of two places in Urbit:

%clay holds files (typically source files).
%gall holds data stores (of all kinds: metadata, data, etc.).

Other data are frequently stored off-ship, such as in an S3 bucket.

Filesystem Operations

Most of the filesystem operations are necessary because of coordination with Earth. That is, one develops software and library code using a text editor or IDE, then needs to synchronize the Martian copy with the Earthling copy. (While there have been a couple of ed clones produced, text editing on Mars is extremely primitive if it exists at all.)

To produce a new desk, one must branch from a current desk:

|merge %new-desk our %base
|mount %new-desk

You have seen |mount and |commit previously, but let’s examine them in light of desks:

:: Mount a desk to the Unix filesystem.
|mount %landscape

:: Commit changes from Unix to Mars.
|commit %landscape

For convenience, you can catch formatted output in the Dojo using *:

*%/output/txt +julia 24

This produces a file home/output.txt (note that the suffix must be a valid mark).

To run a generator on another desk besides %base, you can either change to that desk explicitly or invoke via the desk name.

=dir /=new-desk=
+new-generator
+new-desk!new-generator

Julia Set Fractal Generator (Optional)

Download the file julia.hoon and paste its contents into a new desk named fractals. Commit it and run it using +fractals!julia 24. Redirect its output for a larger input (not more than 100) to a file on the desk.

Tlon Corporation, “Filesystem User’s Manual”

Userspace Operations

The main data operations in Urbit userspace are: scries, pokes, and subscriptions.

A scry is like an HTTP GET retrieval of data.
A poke is like an HTTP PUT request that submits data.
A subscription is a reactive request for data updates.

`++peek`/Scrying

We’ve talked at some length about data access through scrying, but how does one actually do this? Most of the time these events are generated as a matter of course through %gall agents as moves which Arvo automatically handles, but there is a lightweight way to access them as well: .^ dotket.

A .^ dotket call accepts a type p for the result of the call and a compound path which indicates where the call goes and other information.

::  Query %clay for a list of files at the current path.
.^(arch %cy %)

The first letter of the second element (@tas) indicates the destination vane of the move. The second letter of the query is a mode flag. Most vanes only recognize a %x mode, but %clay has a very sophisticated array of calls, including:

%a for file builds
%b for mark builds
%c for cast builds (mark conversion gate)
%w for version number
%x for data
%y for list of children
%z for subtree

Some examples:

::  Build a conversion mark from JSON to txt.
.^(tube:clay %cc /~zod/home/1/json/txt)

::  Query %clay for a list of files at the current path.
.^(arch %cy %)

::  Ask for a file from several hours ago.
.^(arch %cy /(scot %p our)/home/(scot %da (sub now ~h5)))

::  Scry into graph-store for messages
.^(noun %gx /=graph-store=/keys/noun)

::  Scry into metadata-store for current state
.^(noun %gx /=metadata-store=/export/noun)
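
As a reading aid for the two-letter terms in these calls, here is a small Python sketch that splits a term like %cy into its vane letter and mode flag. It is an illustration only. The function name `describe_scry` is hypothetical, the vane table covers only the vanes named in this lesson, and the mode descriptions are copied from the %clay list above.

```python
# First letter: destination vane (only the vanes discussed in this lesson).
VANES = {
    "a": "ames", "b": "behn", "c": "clay", "d": "dill", "e": "eyre",
    "g": "gall", "i": "iris", "j": "jael", "k": "khan",
}

# Second letter: mode flag, as listed above for %clay.
CLAY_MODES = {
    "a": "file build", "b": "mark build", "c": "cast build",
    "w": "version number", "x": "data", "y": "list of children", "z": "subtree",
}


def describe_scry(term: str) -> tuple:
    """Split a term such as '%cy' into (vane, mode, description or None)."""
    name = term.lstrip("%")
    if len(name) != 2:
        raise ValueError(f"expected a two-letter term, got {term!r}")
    vane = VANES[name[0]]
    mode = name[1]
    description = CLAY_MODES.get(mode) if vane == "clay" else None
    return vane, mode, description


print(describe_scry("%cy"))  # ('clay', 'y', 'list of children')
print(describe_scry("%gx"))  # ('gall', 'x', None)
```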

`++poke`/Poking

Subscriptions

As mentioned previously, Urbit prefers a reactive data flow model: when a value or data store is updated, it notifies all subscribers (rather than having them poll periodically). Internal subscriptions, or subscriptions between ships, take several forms:

%clay subscriptions notify subscribers when the filesystem changes.
%gall subscriptions notify subscribers when applications change.
%jael subscriptions notify subscribers when Urbit ID information changes.

A subscription is a stream of events from a publisher to a subscriber, where

the subscriber has indicated interest in that stream,

any update sent to one subscriber is sent to all subscribers on the same stream, and

when the publisher has a new update ready, they send it immediately to their subscribers instead of waiting for a later request.
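
These three properties can be seen in a minimal publish/subscribe model. The Python below is a sketch under stated assumptions rather than anything from %gall: the `Publisher` class and its `watch` and `publish` methods are hypothetical names, and callbacks stand in for subscribers.

```python
from collections import defaultdict


class Publisher:
    """Toy publisher: pushes each update to every subscriber on a path."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # path -> list of callbacks

    def watch(self, path, callback):
        """A subscriber indicates interest in the stream at this path."""
        self._subscribers[path].append(callback)

    def publish(self, path, update):
        """Send the update immediately to all subscribers on the same stream."""
        for callback in list(self._subscribers.get(path, [])):
            callback(update)


alice_inbox, bob_inbox = [], []
store = Publisher()
store.watch("/hexes", alice_inbox.append)
store.watch("/hexes", bob_inbox.append)

store.publish("/hexes", 0xDEADBEEF)   # both subscribers receive this
store.publish("/other", 1)            # nobody is watching this path

print(alice_inbox == bob_inbox == [0xDEADBEEF])  # True
```

Neither subscriber ever polls; the publisher does the work when it has something new.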

channels

https://urbit.org/docs/arvo/concepts/subscriptions

Urbit API

Besides the Urbit internal API (which includes the standard arms of vanes and the semantics of moves), there is also the Urbit external API, which handles Mars ⇄ Earth communications. This is accomplished through the intermediary of %eyre which supports two-way channels to external agents and applications.

“%eyre External API Reference”

Key Points

The main data operations in Urbit are: pokes, scries, and subscriptions.

A poke is like an HTTP PUT request that submits data.

A scry is like an HTTP GET retrieval of data.

A subscription is a reactive request for data updates.

`%gall`: A Minimal Working Example

Overview

Teaching: 20 min
Exercises: 10 min

Questions

What are the structure and expectations of a %gall app?

Objectives

Understand how %gall instruments an app.

Produce a basic %gall app.

“It was six men of Indostan,/To learning much inclined”

On the first day, we talked about bones and trunks and ivory. Today we’re going to meet an elephant.

Unfortunately, there’s not really a good way to modulate the sudden jump in complexity we’re encountering now. You can’t really build an airplane from just a wing: you have to leap from “wing” to “airplane” in one go.

We will proceed at first by simply providing examples of fully-formed %gall agents and discussing their structure and salient features. As our objective today is relatively modest—being able to write and deploy a simple app—this should suffice to whet your appetite.

A Minimal Working Example

Before we do anything substantial with %gall, however, we are simply going to look at a minimal working example. This is the equivalent of a pass statement: it does nothing and talks to no one, whistling in the dark.

/app/alfa.hoon

/-  alfa
/+  default-agent, dbug
|%
+$  versioned-state
  $%  state-0
  ==
::
+$  state-0
  $:  [%0 hexes=(list @ux)]
  ==
::
+$  card  card:agent:gall
::
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
=<
|_  =bowl:gall
+*  this      .
    default   ~(. (default-agent this %|) bowl)
    main      ~(. +> bowl)
::
++  on-init
  ^-  (quip card _this)
  ~&  >  '%alfa initialized successfully'
  =.  state  [%0 *(list @ux)]
  `this
++  on-save   on-save:default
++  on-load   on-load:default
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:default mark vase)
      %noun
    ?+    q.vase  (on-poke:default mark vase)
        %print-state
      ~&  >>  state
      ~&  >>>  bowl  `this
    ==
    ::
      %alfa-action
    ~&  >  %alfa-action
    =^  cards  state
    (handle-action:main !<(action:alfa vase))
    [cards this]
  ==
++  on-arvo   on-arvo:default
++  on-watch  on-watch:default
++  on-leave  on-leave:default
++  on-peek   on-peek:default
++  on-agent  on-agent:default
++  on-fail   on-fail:default
--
|_  =bowl:gall
++  handle-action
  |=  =action:alfa
  ^-  (quip card _state)
  ?-    -.action
    ::
      %append-value
    =.  hexes.state  (weld hexes.state ~[value.action])
    ~&  >>  hexes.state
    :_  state
    ~[[%give %fact ~[/hexes] [%atom !>(hexes.state)]]]
  ==
--

What do you recognize here? What is unfamiliar?

Install this agent by copying in these files:

cp -r src/gall-alfa/* zod/home

(We’ll use a shell script introduced a bit later on to assist with this.)

Until a couple of weeks ago, the syntax to start a %gall agent was |start. As of Grid’s release, this has switched to |rein. Run this by itself to see what it expects as input arguments:

dojo> |rein
>   dojo: nest-need
[ [now=@da eny=@uvJ bec=[p=@p q=@tas r=?([%da p=@da] [%tas p=@tas] [%ud p=@ud])]]
  [desk=@tas arg=it([?(%.y %.n) @tas])]
  liv=?(%.y %.n)
]
>   dojo: nest-have
[ [now=@da eny=@uvJ bec=[p=@p q=@tas r=?([%da p=@da] [%tas p=@tas] [%ud p=@ud])]]
  %~
  liv=?(%.y %.n)
]

We need to do the following in order to have our development desk for the rest of the workshop:

Create a new desk for development.
```
 |merge %echo our %base
```
Mount the desk.
```
 |mount %echo
```
Create the /app/alfa.hoon, /sur/alfa.hoon, and /mar/alfa/action.hoon files (below).
Install the desk (without a %docket file).
```
 |install our %echo
```
Start the agent on the desk.
```
 |rein %echo [& %alfa]
```
Interact with the agent.
```
 :alfa &alfa-action append-value+0xdead.beef
 :alfa %print-state
```
(We distinguish two ways of directly poking an agent. Note that the ++on-poke arm expects [=mark =vase]. If we poke with a bare @tas then the mark is implicitly %noun. Otherwise, we specify a mark with & then the required data, which gets wrapped in a vase.)
At any point, you can check the internal state using the debug subject wrapper:
```
 :alfa +dbug
```

Any nontrivial app needs to define some shared files, which is one of the reasons this is an elephant. In particular, a shared structure in /sur and a mark in /mar are required to handle data transactions.

The shared structure defines common molds like actions and expected structural definitions (e.g. as tagged unions or associative arrays).

/sur/alfa.hoon

|%
+$  action
  $%  [%append-value value=@ux]
  ==
--

Most marks are straightforward or can be developed by glancing at others.

/mar/alfa/action.hoon

/-  alfa
|_  =action:alfa
++  grab
  |%
  ++  noun  action:alfa
  --
++  grow
  |%
  ++  noun  action
  --
++  grad  %noun
--

Exercise

Examine /sur/dns.hoon (basic) or /sur/bitcoin.hoon (advanced).

Examine /mar/json.hoon.

A Shell Script

Place this shell script into your root working directory and use it to update each %gall agent:
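#!/bin/sh
# Copy one agent's mark, structure, and app files into a mounted pier.
# $1 is the pier directory (e.g. zod); $2 is the agent name (e.g. alfa).
mkdir -p "$1/home/mar/$2"
cp -f "$2/src/mar/action.hoon" "$1/home/mar/$2"
cp -f "$2/src/sur/$2.hoon" "$1/home/sur"
cp -f "$2/src/app/$2.hoon" "$1/home/app"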
Usage:
./copy-in.sh zod alfa
Note that this assumes you have already run |mount % and that you run |commit %home after each Unix-side update.

How It Works

Every Gall agent is a door with two components in its subject:

bowl:gall for Gall-standard tools and data structures
App state information

The bowl is a collection of information which renders the agent legible to Arvo, such as providing the subscriptions:

++  bowl              ::  standard app state
  $:  $:  our=ship    ::  host
          src=ship    ::  guest
          dap=term    ::  agent
      ==              ::
      $:  wex=boat    ::  outgoing subscriptions
          sup=bitt    ::  incoming subscriptions
      ==              ::
      $:  act=@ud     ::  change number
          eny=@uvJ    ::  entropy
          now=@da     ::  current time
          byk=beak    ::  load source
  ==  ==

For instance, the incoming subscriptions are a map from the duct (or (list path)) to a particular path on a particular ship.

+$  bitt  (map duct (pair ship path))

The duct is the main construct for tracking information. (Think back to our discussion of scrying: this is the same concept in new clothes.) The path or wire (same thing) bears a characteristic structure for each vane. For instance, a directory listing from %clay is a simple path into /c with a tag y indicating the type of request:

> `path`[%cy /===/sys/vane]
/cy/~sev/home/~2021.9.17..17.14.19..3635/sys/vane
> `(list @t)`[%cy /===/sys/vane]
<|cy ~sev home ~2021.9.17..17.15.48..8a3f sys vane|>
> `(list @ta)`[%cy /===/sys/vane]
/cy/~sev/home/~2021.9.17..17.15.58..ccd9/sys/vane
> `(list @tas)`[%cy /===/sys/vane]
~[%cy %~sev %home %~2021.9.17..17.16.01..a51e %sys %vane]

(This path can be directly executed with .^ dotket as follows: .^(arch %cy /===/sys/vane).)

A %gall path could look like this:

> `path`[%gx /=settings-store=/has-entry/urbit-agent-permissions/'http://localhost:3000'/noun]

So this is called a path but it’s really a complete package of request type, agent information, and data with metadata.

Agents v. Apps

The terminology for userspace has not yet completely solidified. I strive to use “agent” to refer to a particular running instance, like a “container” in Docker, whereas an “app” is more like a Docker “image”, the archetypal instance. However, I will frequently and inadvertently use “agent” in synecdoche to refer to apps.

Key Points

A %gall app expects certain arms to be present to handle and emit events.

`%gall`: Adding Functionality

Overview

Teaching: 30 min
Exercises: 15 min

Questions

How can I build a %gall app which operates on data?

Objectives

Manage internal %gall state.

Scry for needed information.

Generally speaking, the Urbit data flow model is reactive, meaning that rather than polling for updates periodically, one subscribes to a data source which notifies all subscribers when a change occurs.

Arvo defines a number of standard operations for each vane. Notable among these are peeks (also called scries), which grant read-only access to data; and pokes, which accept moves and process them. Unlike peeks, pokes actually alter the agent’s (and Arvo’s) state rather than just retrieving information.

We are going to widen our view a little bit as well with this agent: we will not use the default arms but will define our own NOP defaults. This way you will be able to see what sort of information each arm processes.

/app/bravo.hoon:

/-  bravo
/+  default-agent, dbug
|%
+$  versioned-state
  $%  state-0
  ==
::
+$  state-0
  $:  [%0 hexes=(list @ux)]
  ==
::
+$  card  card:agent:gall
::
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
=<
|_  =bowl:gall
+*  this      .
    default   ~(. (default-agent this %|) bowl)
    main      ~(. +> bowl)
::
++  on-init
  ^-  (quip card _this)
  ~&  >  '%bravo initialized successfully'
  =.  state  [%0 *(list @ux)]
  `this
::
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  ~&  >  '%bravo recompiled successfully'
  `this(state !<(versioned-state old-state))
::
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:default mark vase)
      %noun
    ?+    q.vase  (on-poke:default mark vase)
        %print-state
      ~&  >>  state
      ~&  >>>  bowl  `this
        [%print-pop @ux]
      ~&  >>  +>:vase  `this
    ==
    ::
      %bravo-action
    ~&  >  %bravo-action
    =^  cards  state
    (handle-action:main !<(action:bravo vase))
    [cards this]
  ==
::
++  on-watch
  |=  =path
  `this
::
++  on-leave
  |=  =path
  `this
::
++  on-peek
  |=  =path
  *(unit (unit cage))
::
++  on-agent
  |=  [wire sign:agent:gall]
  `this
::
++  on-arvo
  |=  [=wire =sign-arvo]
  `this
::
++  on-fail
  |=  [=term =tang]
  `this
--
|_  =bowl:gall
++  handle-action
  |=  =action:bravo
  ^-  (quip card _state)
  ?-    -.action
    ::
      %push
    =.  hexes.state  (weld hexes.state ~[value.action])
    ~&  >>  hexes.state
    :_  state
    ~[[%give %fact ~[/hexes] [%atom !>(hexes.state)]]]
    ::
      %pop
    =/  popped  (rear hexes.state)
    =.  hexes.state  (snip hexes.state)
    ~&  >>  hexes.state
    :_  state
    :~  [%give %fact ~[/hexes] [%atom !>(hexes.state)]]
        [%pass /print-pop %agent [our.bowl %bravo] %poke %noun !>([%print-pop popped])]
    ==
  ==
--

You should copy the structure file and mark file from %alfa and adapt them as appropriate for %bravo. This should be a matter of copying to the appropriate path and changing any internal references.

The structure file should accommodate the following actions:

%push (equivalent in effect to the old %append-value), [%push value=@ux]
%pop (will remove the most recent item from the list and return it), [%pop ~]

:bravo &bravo-action push+0xacdc
:bravo &bravo-action pop+~
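
The two actions above could be declared along the following lines. This is a sketch rather than the workshop's reference solution: it assumes the file paths /sur/bravo.hoon and /mar/bravo/action.hoon, and the face value is chosen to match the value.action reference in ++handle-action.

```hoon
::  /sur/bravo.hoon (sketch)
::  One tagged-union case per action accepted by the agent.
|%
+$  action
  $%  [%push value=@ux]
      [%pop ~]
  ==
--
```

```hoon
::  /mar/bravo/action.hoon (sketch)
::  Converts between a raw noun and the action structure above.
/-  bravo
|_  =action:bravo
++  grab
  |%
  ++  noun  action:bravo
  --
++  grow
  |%
  ++  noun  action
  --
++  grad  %noun
--
```

With these in place, push+0xacdc is shorthand for [%push 0xacdc] and pop+~ for [%pop ~], which is why the pokes shown above type-check against the structure.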

We will also accommodate external scrying into the agent through the ++on-peek arm. Once the above works correctly, you should add in an augmented ++on-peek arm:

++  on-peek
  |=  =path
  ^-  (unit (unit cage))
  ?+    path  (on-peek:default path)
      [%x %hexes ~]
    ``noun+!>(hexes)
  ==

This arm typically accepts two kinds of scries (called cares):

%x represents data. %x scries typically return a cage with mark.
%y represents paths. %y will return a cage of mark %arch and vase type arch.

For this case, we only need to return data, so we will only support %gx scries. (We have nothing path-like in this agent.)

Scry results are directly accessible via .^ dotket operations at the Dojo prompt (and more generally to other agents). However, scries can only be performed locally: as a security mechanism, there are no remote scries. Remote agent data must be formally requested via a poke and return.

.^((list @ux) %gx /=bravo=/hexes/noun)

units, cages, and vases, Oh My!

A unit allows us to distinguish “no result” from “zero result”. Since every atom in Hoon is an unsigned integer, this allows us to tell the difference between an operation that has no possible result and an operation that succeeded but returned ~ or 0. A unit can be trivially produced from any value by prefixing a tic mark (`).

A vase pairs a value with its type. A vase generally results from !> zapgar; the type spear -:!>() extracts just the type from such a vase.

A cage is a marked vase; that is, a vase with additional information about its structure. A cage is more or less analogous to a file in a regular filesystem.

These bear the following relationship to a simple atom:
> !>(1)
[#t/@ud q=1]
> (vase !>(1))
[#t/@ud q=1]
> (cage `(vase !>(1)))
[p=%$ q=[#t/@ud q=1]]
(That last cage’s p is its mark. Here it is the empty term %$ because the tic mark supplied ~ as the head of the cell.)
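
As a brief sketch of how these wrappers are built and taken apart, the following Dojo expressions may be tried on a fakezod. The name val is a hypothetical Dojo variable introduced only for illustration.

```hoon
> `42
> (some 42)
> =val `(unit @ud)`(some 42)
> (need val)
> (fall val 0)
> (fall `(unit @ud)`~ 0)
> !<(@ud !>(42))
> -:!>(42)
```

The first two lines wrap 42 as a unit. ++need unwraps a unit and crashes on ~, while ++fall unwraps it with a default for the empty case. The last two lines round-trip a value through a vase with !> zapgar and !< zapgal, then show only the type half of the vase.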

We would be remiss to not also address arch:

An arch is basically a file directory (in %clay) or a list of store paths (in %gall).

With the completion of this exercise, you have seen how to alter and query state using command-line and agent-based tools. Next, we will take a look at other means for manipulating agent state.

Key Points

A %gall app can be outfitted with a helper core to provide necessary operations.

`%gall`: Interfacing with a Client

Overview

Teaching: 60 min
Exercises: 30 min

Questions

How can I build a %gall app which operates on data?

Objectives

Produce an intermediate %gall app.

Understand how %gall interfaces with an external client.

Communications

https://urbit.org/docs/userspace/graph-store/sample-application-overview

We need to examine all of the ways a %gall app can communicate with the outside world. Recall that an agent has ten arms:

|_  =bowl:gall
++  on-init
++  on-save
++  on-load
++  on-arvo
++  on-peek
++  on-poke
++  on-watch
++  on-leave
++  on-agent
++  on-fail
--

Arvo alone interacts with several of these:

++  on-init
++  on-save
++  on-load
++  on-arvo

The ++on-agent and ++on-fail arms are called in certain circumstances (i.e. as an update to a subscription to another agent or cleanup after a %poke crash). We can leave them as boilerplate for now.

If you are exposing information, you can do so via a peek (++on-peek), a response to a poke (++on-poke), or a subscription (++on-watch). (++on-leave handles cleanup after a terminated subscription.) These are the main ways that the Urbit API protocol (formerly Airlock) interacts with an agent on a ship. We’ll focus on these.

`++on-peek` Scry

A scry represents a direct look into the agent state using the Nock .^ dotket operator.

Only local scries are permitted.

`++on-poke` Request

A poke initiates some kind of well-defined action by an agent. Typically this either triggers an event (such as charlie and bravo’s modification of hexes) or requests a data return of some kind.

Remote pokes are allowed (and common for single-instance requests).

`++on-watch` Subscription

A subscription is a data-reactive standing request for changes. For instance, one can watch a database agent for any changes to the database. Whenever a change occurs, the agent notifies all subscribers, who then act as they should in the event of a message being received (e.g. from a particular ship).

Remote subscriptions are in common use.
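
For illustration only, a non-default ++on-watch arm might look like the sketch below. It is not part of the agents in this lesson, which leave ++on-watch as a default. It is a fragment meant to sit inside an agent door such as %bravo's, where default, this, card, and hexes are already defined. The /hexes path and %atom mark are taken from the existing %give %fact cards.

```hoon
::  Sketch of a ++on-watch arm that accepts subscriptions on /hexes.
::  Assumes the enclosing agent door defines default, this, and hexes.
++  on-watch
  |=  =path
  ^-  (quip card _this)
  ?+    path  (on-watch:default path)
      [%hexes ~]
    ::  an empty path list sends the fact only to the new subscriber
    :_  this
    ~[[%give %fact ~ [%atom !>(hexes)]]]
  ==
```

Any other path falls through to the default arm, which rejects the subscription. Later updates would reach subscribers through the %give %fact ~[/hexes] cards already emitted by ++handle-action.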

The agent

%charlie is yet another upgrade of %bravo which allows remote ships to poke each other peer-to-peer and push hex values to or pop hex values from each other’s hexes:

/app/charlie.hoon:

/-  charlie
/+  default-agent, dbug
|%
+$  versioned-state
  $%  state-0
  ==
::
+$  state-0
  $:  [%0 hexes=(list @ux)]
  ==
::
+$  card  card:agent:gall
::
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
=<
|_  =bowl:gall
+*  this      .
    default   ~(. (default-agent this %|) bowl)
    main      ~(. +> bowl)
::
++  on-init
  ^-  (quip card _this)
  ~&  >  '%charlie initialized successfully'
  =.  state  [%0 *(list @ux)]
  `this
++  on-save   on-save:default
++  on-load   on-load:default
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:default mark vase)
      %noun
    ?+    q.vase  (on-poke:default mark vase)
        %print-state
      ~&  >>  state
      ~&  >>>  bowl
      `this
      ::
        [%push-local @ux]
      ~&  >  "got poked from {<src.bowl>} with val: {<+.q.vase>}"
      =^  cards  state
      (handle-action:main ;;(action:charlie q.vase))
      [cards this]
      ::
        [%pop-local ~]
      ~&  >  "got poked from {<src.bowl>} with val: {<+.q.vase>}"
      =^  cards  state
      (handle-action:main ;;(action:charlie q.vase))
      [cards this]
    ==
    ::
      %charlie-action
    ~&  >  %charlie-action
    =^  cards  state
    (handle-action:main !<(action:charlie vase))
    [cards this]
  ==
++  on-arvo   on-arvo:default
++  on-watch  on-watch:default
++  on-leave  on-leave:default
++  on-peek
  |=  =path
  ^-  (unit (unit cage))
  ?+    path  (on-peek:default path)
      [%x %hexes ~]
    ``noun+!>(hexes)
  ==
++  on-agent  on-agent:default
++  on-fail   on-fail:default
--
|_  =bowl:gall
++  handle-action
  |=  =action:charlie
  ^-  (quip card _state)
  ?-    -.action
    ::
      %push-remote
    :_  state
    ~[[%pass /poke-wire %agent [target.action %charlie] %poke %noun !>([%push-local value.action])]]
    ::
      %push-local
    =.  hexes.state  (weld hexes.state ~[value.action])
    ~&  >>  hexes.state
    :_  state
    ~[[%give %fact ~[/hexes] [%atom !>(hexes.state)]]]
    ::
      %pop-remote
    :_  state
    ~[[%pass /poke-wire %agent [target.action %charlie] %poke %noun !>(~[%pop-local])]]
    ::
      %pop-local
    =.  hexes.state  (snip hexes.state)
    ~&  >>  hexes.state
    :_  state
    ~[[%give %fact ~[/hexes] [%atom !>(hexes.state)]]]
  ==
--

At this point, if you are running a fakezod then the fakezods must be able to see each other over the local network. Typically this means running two different fakezods on the same host machine. Alternatively, you can spin up a comet or moon and do this with your teammates. We have no filtering for agent permissions here. This will be critical for real-world deployments.

A %charlie agent needs to know how to do two things: receive a push (with data) and receive a pop (here, functionally, a delete rather than a return).

/sur/charlie.hoon

|%
+$  action
  $%  [%push-remote target=@p value=@ux]
      [%push-local value=@ux]
      [%pop-remote target=@p]
      [%pop-local ~]
  ==
--

/mar/charlie/action.hoon

/-  charlie
|_  =action:charlie
++  grab
  |%
  ++  noun  action:charlie
  --
++  grow
  |%
  ++  noun  action
  --
++  grad  %noun
--

The actions:

:charlie &charlie-action [%push-remote ~sampel-palnet 0xbeef]
:charlie &charlie-action [%push-local 0xbeef]
:charlie &charlie-action [%pop-remote ~sampel-palnet]
:charlie &charlie-action pop-local+~

Graph Store and Permissions on Mars (Optional)

Many Gall apps use Graph Store, a backend data storage format and database that provides both internally consistent data and external API communications endpoints.

To understand Graph Store, think in terms of the Urbit data permissions model:

A store is a local database.

A hook is a permissions broker for the database. It requests and returns information after negotiating access with remote agents.

A view is a data aggregator which parses JSON objects for external clients such as Landscape.

Graph Store handles data access perms at the hook level, not at the store level.

References

Tlon Corporation, “Graph Store Overview”

Tlon Corporation, “Graph Store Sample Application: Library”

Key Points

A %gall app can talk to a user interface client.

`%gall`: Constructing an App

Overview

Teaching: 0 min
Exercises: 45 min

Questions

How can I build a %gall app starting from a minimal working example?

Objectives

Produce a new %gall app with specified behavior.

Building %delta to Count Pokes

This entire section is a team exercise. You should work in a GitHub repository to which you all have access and employ basic pair programming to talk through your solution process.

Your team’s objective is to update the %charlie agent to a new %delta agent, which does the following:

Accept remote pokes from another ship. Pokes should be either increment or decrement requests.

Upon receipt of an increment request, increment a count in a map of @p→@ud for each ship. Add the key if necessary.

Upon receipt of a decrement request, decrement the corresponding counter.

In either case, print the resulting value on the local ship and send a poke to the remote ship to make it print the result as well.

Hints

While it is possible to set up a remote Urbit testnet, for our purposes it is simpler to test the agent using two (or more) fakezods on a single computer. (You can also use several comets on the live network.)

Of course, since you can only have one ~zod, you should pick another ship (galaxy, star, or planet). Use @p on any unsigned integer for inspiration, then boot a new clean ship and backup as on Day One.

You can distinguish output lines visually using the >/>>/>>> syntax for ~&:

  ~&  >  "log"        :: blue (log)
  ~&  >>  "warning"   :: yellow (warning)
  ~&  >>>  "error"    :: red (error)

You should store the sending ship and counter as a map. Use the ++by door to work with maps.

Most of the arms can be left alone as defaults: for instance, there is no subscription in this model.
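
The ++by hint above can be explored in the Dojo before any agent code is written. The sketch below uses counts as a hypothetical name for the map, with ~nec and ~bud standing in for arbitrary ships. It demonstrates the map operations only and is not a solution to the exercise.

```hoon
> =counts *(map @p @ud)
> =counts (~(put by counts) ~nec 1)
> (~(get by counts) ~nec)
> (~(gut by counts) ~bud 0)
> =counts (~(put by counts) ~nec +((~(gut by counts) ~nec 0)))
> (~(got by counts) ~nec)
```

++get:by returns a unit, ++gut:by returns a default when the key is missing, and ++got:by crashes on a missing key. Because ++put:by returns a new map rather than mutating the old one, the result must be stored back. Note that ++dec crashes on 0, so a decrement handler needs to decide what to do when a counter is already zero.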

Key Points

A %gall app can be readily produced to a standard of operability.

`%gall`: Releasing an App

Overview

Teaching: 15 min
Exercises: 30 min

Questions

How can I deploy a release-worthy %gall app?

How can I authenticate my app?

Objectives

Sign and deploy the app.

Install the app on another ship (planet, moon, or comet).

Installing an App

For a long time Urbit was too unstable to develop on easily, but over the past couple years that’s improved greatly. Now the biggest hurdle to getting more apps on Urbit is that there’s no real way to distribute them.

A pyramid capstone from Amenemhat III

What makes a regular %gall app into a deployed app? In the past, anyone wanting to use a %gall app has needed to access a repo, copy the files down into app/, and manually start the app. If something was buggy, you risked lobotomizing your ship, so it was common to set up moons to run installed agents on. Not total bedlam, but not for the faint of heart.

Grid and %docket automate most of the process now for users. Software distribution has now become a first-class service on Mars.

References

Installing an App

We can install a remote app onto our local ship at the command line (rather than via the browser UI).

> |install ~paldev %pals
>=
ames: czar dev.urbit.org: ip .35.227.173.38
; ~paldev is your neighbor
kiln: activated install into %pals from [~paldev %pals]
kiln: downloading update for %pals from [~paldev %pals]
kiln: finished downloading update for %pals from [~paldev %pals]
kiln: merging into %pals from [~paldev %pals]
kiln: merge into %pals succeeded
kiln: commit detected at %pals from [~paldev %pals]
gall: installing %pals

(Or try |install ~paldev %picture or |install ~dister-dotnet-ritpub-sipsyl %urchatfm for other examples.)

This should automatically start any appropriate agents. Check agent status per desk:

> +agents %pals
status: running   %pals

Resources

~palfun-foslup, “pals”

Releasing an App

You have seen how to install and activate an agent on your local ship. Let’s look at how to set up and deploy your own app onto the network.

The basic concept of software distribution for Urbit is that a ship has a desk with self-contained agent code and a %docket mark.

Broadly speaking, desks look the same, except for some modest additions for agent registration. The directories still obtain as follows:

/app for agents
/gen for generators
/lib for library and helper files
/mar for marks
/sur for shared structures
/ted for threads

These new files contain critical information to instrument the distributed software:

/sys/kelvin: kernel Kelvin version (required)

base/sys.kelvin:
```
  [%zuse 420]
```

/desk/bill: a list of agents to run automatically (for %kiln) (optional)

base/desk.bill:

  :~  %acme
      %azimuth-tracker
      %dbug
      %dojo
      %eth-watcher
      %hood
      %herm
      %lens
      %ping
      %spider
  ==

/desk/docket: app metadata (for %docket) (optional)

base/sur/docket.hoon:

  +$  clause
    $%  [%title title=@t]
        [%info info=@t]
        [%color color=@ux]
        [%glob-http url=cord hash=@uvH]
        [%glob-ames =ship hash=@uvH]
        [%image =url]
        [%site =path]
        [%base base=term]
        [%version =version]
        [%website website=url]
        [%license license=cord]
    ==

Example:

  :~
    title+'Delta'
    info+'A distributed peer-to-peer poke demonstration app.'
    color+0xcd.cdcd
    glob-ames+[~zod 0v0]
    image+'https://upload.wikimedia.org/wikipedia/commons/thumb/f/f5/Greek_uc_delta.svg/1200px-Greek_uc_delta.svg.png'
    base+'delta'
    version+[0 0 1]
    license+'MIT'
    website+'https://en.wikipedia.org/wiki/Delta_Force'
  ==

Setup

To set this up, the first thing we need to do is create a new desk in %clay which will hold all of the relevant information about the app, including files and metadata. Typically we base this on our %base desk:

|merge %new-desk ~sampel-palnet %base

Mount and commit the appropriate files. Then make the new desk public:

|public %new-desk

This is a $docket mark with annotation:

+$  href                                :: where a tile links
  $%  [%glob base=term =glob-location]  :: location of client-side data
      [%site =path]                     :: location of server-rendered frontend
  ==                                    ::
::                                      ::
+$  url   cord                          :: URL type
::                                      ::
+$  glob-location                       :: how to retrieve glob (client-side)
  $%  [%http =url]                      :: HTTP source, if applicable
      [%ames =ship]                     :: %ames source, if applicable
  ==                                    ::
::                                      ::
+$  version                             :: version of app (not Kelvin version)
  [major=@ud minor=@ud patch=@ud]       ::
::                                      ::
+$  docket                              ::
  $:  %1                                :: Docket protocol tag
      title=@t                          :: text on home screen
      info=@t                           :: long-form description
      color=@ux                         :: app tile color
      =href                             :: link to client bundle
      image=(unit url)                  :: app tile background
      =version                          :: version of app (not Kelvin version)
      website=url                       :: URL to open on click
      license=cord                      :: software release license
  ==                                    ::

References

Tlon Corporation, “Distribution”

Key Points

The Urbit software distribution service affords a straightforward way to deploy, update, and remove %gall apps.

Next Steps

Overview

Teaching: 15 min
Exercises: 0 min

Questions

What should I do next to contribute to the Urbit community?

Objectives

Install the app on another ship (planet, moon, or comet).

What Else is There?

We have had a whirlwind tour of developing basic %gall agents for Urbit. We have necessarily had to leave out some important topics, including:

Graph Store operations and data format
Endpoint operations using %eyre
Detailed JSON production and parsing
Remote subscriptions and kicks
Front-end development (à la Landscape)
Threading for transient data requests
Composition of store/hook/view arrangement

For a demonstration of a few of these, I suggest you examine dcSpark/authenticate-with-urbit-id, which is a small agent that demonstrates parsing JSON input, subscribing to Graph Store, and some other simple elements of %gall agents.

For a more complex single-purpose example including front-end development, you should examine yosoyubik/canvas, a peer-to-peer drawing app with a browser frontend and an Urbit backend.

To learn more about threads, read about Spider in the docs and examine ted/example-fetch.hoon in your own Urbit. Threading is particularly useful for handling complicated IO outside of an agent without compromising the agent’s internal state.

Concrete Next Steps

Talk to everyone here. Find out what’s going on, what folks are working on, and how you can plug into the bigger picture.
Identify new ways to use what Urbit offers within your sphere of influence and responsibility.
Just start making things! Lots of small projects are helpful for you to grow and become comfortable with the shifts in mental model and frame that Urbit affords.

Joining the Urbit Developer Community

Discovery

urbit/awesome-urbit

Contributing

The Urbit ecosystem is primarily developed by Tlon Corporation, a few companies (Urbit.Live, Tirrel, dcSpark, and some hosting companies), and an army of open-source contributors. The types of projects which you can contribute to include:

End-user applications (“userspace”, %gall)
Operating system functionality (Arvo & vanes)
Runtime (king/serf, jets)

As the Urbit developer community grows, there will be opportunities as well for various types of services, ranging from service hosts to star and galaxy owners.

The Tlon contributor’s guide largely focuses on kernel contributor discipline, but provides a good framework for approaching development of any part of the ecosystem.

If you have not participated in an open-source project before, please check these resources as well:

Gregory V. Wilson, “Joining a Project” (good advice from my mentor)
Zara Cooper, “Getting started with contributing to open source” (more generic advice from SO)
Philip Monk ~wicdev-wisryt, “Urbit Precepts” (the philosophy underlying Urbit’s architectural decisions)

Main Groups

Some community-facing groups which you can use to learn more about programming and developing in Urbit include:

~hiddev-dannut/new-hooniverse (Hooniverse, beginner-oriented)
~littel-wolfur/the-forge (The Forge, general development)
~dasfeb/smol-computers (Smol Computers, running on non-x86 hardware)
~pindet-timmut/urbitcoin-cash (BTC)

Urbit Foundation Opportunities

The Urbit Foundation maintains an active developer outreach including:

Bounties
Apprenticeships
Grants

Talk to Josh Lehman ~wolref-podlex for more information.

Key Points

The Urbit software distribution service affords a straightforward way to deploy, update, and remove %gall apps.
