Links Syntax

Comments

Comments are introduced by a hash mark, #, and continue for the rest of the line.

Literals

Arithmetic

Links supports the standard arithmetic operators.

 +                : (Int, Int) -> Int
 -                : (Int, Int) -> Int
 *                : (Int, Int) -> Int
 /                : (Int, Int) -> Int
 ^                : (Int, Int) -> Int
 mod              : (Int, Int) -> Int
 *.               : (Float, Float) -> Float
 +.               : (Float, Float) -> Float
 -.               : (Float, Float) -> Float
 /.               : (Float, Float) -> Float
 ^.               : (Float, Float) -> Float

As Links does not yet have any support for overloading, the floating point versions are distinguished using the ``.'' suffix. The arithmetic operators can be used infix as is or prefix when enclosed in parentheses.

  1+2*3

returns 7.

  (*.)(6.0, 7.)

returns 42..

Lists

A list is a finite sequence of values, constructed using [] (pronounced ``nil'') and :: (pronounced ``cons'').

   1 :: 4 :: 9 :: 16 :: []

A list can be created directly by wrapping a series of comma-separated expressions between brackets:

    [1, 4, 9, 16]
    
    ["apple", "banana", "pear"]
    []
    x = true;
    [true, false, x, true]

Note that all elements of a list must be of the same type.

Lists support the ``concatenate'' operation, denoted by two plus characters:

    [1, 2] ++ [3, 4, 5] == [1, 2, 3, 4, 5]

And they're comparable, with ==, as you can see!

The ``Cons'' operator (::) appends an element to the start of a list.

  links> 1::[2,3,4,5];
  [1, 2, 3, 4, 5] : [Int]

The head (hd) and tail (tl) functions each take a single list as an argument. They return respectively: the first element of the list; and the list consisting of all elements but the first from the original list.

  links> hd([1,2,3]);
  1 : Int
  links> tl([1,2,3]);
  [2, 3] : [Int]

The take and drop functions return respectively: the first n elements of a list; and all but the first n elements of a list.

  links> take(2,[1,2,3]);
  [1, 2] : [Int]
  links> drop(2,[1,2,3]);
  [3] : [Int]

Pattern matching on lists

Cons and nil can also be used in patterns, to deconstruct lists:

  switch (s) {
    case []    -> Empty
    case x::xs -> NonEmpty
  }

Integer Ranges

The syntax [a .. b] constructs a list of all the integers between a and b, inclusive. The result is empty if a is greater than b.

Pairs, Tuples, and Records

A tuple is a finite sequence of values, possibly of different types. An n-tuple is constructed by enclosing n expressions in parenthesis and separating them with commas:

   (a, "OK")
   (true, "a string", 42)

A record is something like a tuple, but its fields are indexed by field names rather than integer indices. A record is written like a tuple but with fieldnames preceeding the fields:

  (lastname="Bond", firstname="James", license="To kill")

Field names are not enclosed in quotes and are not expressions. They are not evaluated. Note that, whereas the content of a field can be any expression, the field name must be literally present when constructing a record. For example:

  var item = (drinkname = "latte", price = 2.0 +. 0.5)          # OK
  var item = ("drink" + "name" = "latte", price = 2.0 +. 0.5)   # NOT OK

You can access the fields of a record by projecting them, using the dot notation:

  item.drinkname == "latte"
  (lastname="Bond", firstname="James").lastname == "Bond"

Record modifications

You can add a field to an arbitrary record using the record extension operation. This operation works only when the field is not already present in the record. As an example:

  (caffeineContent = 60 | item)

Given the definition from above, this would yield a value

  (caffeineContent = 60, drinkname = "latte", price = 2.5)

To overwrite the value in a field, when that field is already present, use the ``with'' notation:

  (item with drinkname = "capuccino")

This yields

  (drinkname="capuccino", price=2.5)

Records do not need to be declared; a given field name can appear in any record. The only requirement is that a given field is used with a consistent type, insofar as the the values may interact. It is acceptable to use a particular field name with different types, as long as the two are used separately--that is, as long as the same value will never be used in differently-typed expressions.

Polymorphic variants

A ``variant type'' is one that uses explicit ``tags'' (or ``labels'') to distinguish different sets of possible values as to their meaning. For example, a mode of transport may be either Automobile or Camel. If it is Automobile, we want to know what fuel it takes; if it is Camel, we want to know how many humps it has. In Links, values like these can be expressed as follows:

  Automobile(Diesel)
  Automobile(Unleaded)
  Camel(2)

The type that includes such values is written as follows:

  [|Automobile: [|Diesel:() | Unleaded:() | Biodiesel:()|] |
    Camel: Int |]

The boxy brackets [| |] delimit a variant type, and variant labels are separated by a pipe |. After each variant label, separated by a colon :, is the type of its contents--a Camel has a number of humps so its content type is Int, whereas the Automobile content type is another variant type, [|Diesel | Unleaded|].

In Links, a variant tag always begins with a capital letter. Any string beginning with a capital letter, used in a value context, denotes a variant label.

To inspect a variant value, use pattern matching. Pattern matching is accomplished using the switch form, which has a target expression and a case for each variant label. The following expression determines the effective number of humps of a transport (autos have no humps):

  switch (target) {
    case Automobile(fuelType) -> 0
    case Camel(humpCount) -> humpCount
  }

The expression expr is evaluated to produce a value of variant type; then the label is examined and one of the cases is chosen. The lowercase word following the variant label in a case is bound to the content of the target value (provided that case actually matches the target). This allows us to use the variable humpCount within the body of the Camel case. The body of a case (everything between the -> and the next case (if any) or the end of the switch) produces the result of the whole switch expression, and all case bodies of a switch must have the same type.

Type-checking will also ensure that all possible cases are matched by the switch expression. To handle arbitrary variant values, you can add an open case to the end of the switch:

  switch (target) {
    case Automobile(fuelType) -> 0
    case Camel(humpCount) -> humpCount
    case other -> 0
  }

Since other begins with a lowercase letter, it is a variable, which matches any value. Unlike the variables in the previous cases, which are wrapped inside variant labels, other is used here as the complete pattern to match for its case, so it will match anything. Patterns are tried in the order they are given, so the other case will not by selected unless the previous cases do not match.

Here we assume that nothing other than a camel has humps.

XML Quasiquotes

Also called ``quasis,'' these are the principal way to embed XML values within your program. A quasi is introduced by an XML start-tag such as <foo>. Inside a quasi, you may escape from the XML mode back into Links syntax, by enclosing the Links code within braces { }. When the quasi is evaluated, all its escaped expressions will be evaluated, and their values will be embedded within the constructed XML value. For example:

    <html>
      <ul>
        <li> First item: {item1} </li>
        <li> Second item: {item2} </li>
      </ul>
    </html>

Within an XML quasi, a variety of special features can be used to handle browser events. See Handling User Actions, below, for more information on these features.

To introduce a forest of XML nodes that have no mutually enclosing XML element, use the special <#>...</#> syntax:

    <#>
      Name: <b>{name}</b>
    </#>

Here, a sequence of nodes (first, an XML text node, and then a <b> element) becomes as a single expression. This value can be included directly in other XML quasis:

    var nameLine = <#> Name: <b>{name}</b> </#>;
    <div>
      {nameLine} <br />
      The Links Team
    </div>

Forms

Links provides two special attributes for programming with HTML forms: l:name, which binds the value of an input field to a variable, and l:onsubmit, with which you can supply an expression to be evaluated when the form is submitted.

   <form l:onsubmit="{say_hello(personName)}">
      What is your name? 
      <input l:name="{personName}"/>
   </form>

The personName variable bound by the l:name attribute is in scope in the l:onsubmit expression, which may also contain references to other variables in the enclosing scope. The values of these variables are passed from form to form automatically.

Unfortunately, the l: mechanism does not support any abstraction in building forms. For a more abstraction-friendly construct, see Formlets, describing a mechanism for abstracting form fragments.

Comparisons and Boolean Expressions

Comparisons are binary operations that yield a true/false value.

Boolean expressions can be combined using the boolean operators:

Conditionals

A conditional expression has a condition, a consequent, and an else clause. None of the three may be omitted.

    if (x == y)
        expr1
    else
        expr2

Curly braces can be wrapped around either clause, no matter how many sub-expressions they contain, and must be so wrapped if you want the clause to consist of more than one semicolon-separated expression.

Note that an if-else expression always returns a value in Links; the return values of the two branches must be of the same type, and both branches are required.

Variables

Variables are single-assignment in Links. The form

    var x = expr;
    etc

evaluates expr and binds the name x to the resulting value, within the expression etc.

Variable assignments have block scope. The following example

    var x = 1;
    if (condition) {
      var x = 2
    } else {
      var x = 3
    };
    print int_to_string(x);

prints 1 because the assignments to x within the if clauses only bind within those clauses. If you want the value printed to depend on the condition, you should assign to x from the result of the whole if expression:

    var x = if (condition) {
      2
    } else {
      3
    };
    print int_to_string(x);

Blocks

A sequence of variable bindings separated by semicolons is evaluated in text order, binding each value to the corresponding name in what follows. Variables cannot be rebound; later bindings of the same name shadow preceding ones:

    var x = 1;
    var y = 2;
    var x = 2;
    var z = x + y;    # z is now bound to 4

The scope of a binding is strictly within its immediate block. As a result, the following code may not do what you expect:

    var x = 0;
    if (a == b)
        var x = 1;
    else
        var x = 2;
    alert(x);

The value printed by alert will be 0. That's because the other two bindings only bind x within the corresponding clauses of the conditional. This may come as a surprise to programmers used to imperative languages.

Functions

Functions can be named or anonymous. Named functions look like this:

    fun foo(x, y, z)
    {
        # ... body 
    }

Anonymous functions just omit the name: fun (x) { x + 1 } is an expression that evaluates to an anonymous function value.

Function values, whether named or anonymous, are lexical closures; any variables free in the body must refer to bindings from a surrounding lexical scope. The smallest surrounding scope is chosen.

A function can be called by using its name, followed by a list of arguments in parentheses:

    foo(1, 2, 7)

This works whether foo is a function defined with a name, as fun foo(...) {...}, or a variable bound to a functional value, as

    var inc = fun (x) {x + 1};
    inc(7)

This block returns 8.

Any expression that evaluates to a function value can be called:

    (if (true) fun (x) { x + 1 }
     else fun (x) { x + 2 })(3)

Loops (List Comprehensions)

The principal loop construct in Links is the list comprehension:

    for (x <- source)
       body

Both the source and the body should be expressions that evaluate to lists.

The value of a comprehension is the concatenation of all the lists produced by evaluating the body, once for each element of source, and binding that element to the variable x. For example:

    var source_list = [1, 2, 3];
    for (x <- source_list)
        [ x*x ]

constructs a list of the squares of the values in source_list. Note that more than one value can be included in the body list:

    var source_list = [2, 3, 7, 8, 9, 55];
    for (n <- source_list)
        if (odd(n))
           [n, n+1]
        else
           [n]

This example returns [2, 3, 4, 7, 8, 8, 9, 10, 55, 56].

Other forms of looping can be implemented using tail recursion.

Filtering comprehensions

A comprehension can be filtered using the where clause.

    var source = [2, 3, 4, 5, 6, 7];
    for (x <- source)
    where (odd(x))
      [x+1]

returns [4, 6, 8].

A where clause is equivalent to a condition nested within a comprehension:

    for (x <- src)
    where (pred)
      expr

is equivalent to

    for (x <- src)
      if (pred)
        expr
      else []

where is a clause on for comprehensions: it cannot be used outside of a for.

Sorting comprehensions

The orderby clause on for comprehensions is used to sort the source before evaluating the body.

For example, suppose ``models'' is a list declared previously with type [(release_year:Int,model_number:Int,model_name:String)], describing models of an automobile make. Then the following will return a list of pairs describing the models, ordered by their year of release:

    for (m <- models)
    orderby (m.release_year)
      [(m.model_number, m.model_name)]

Multi-generator comprehensions

A comprehension can draw elements from more than one list. For each element produced by the first generator, Links iterates over all the items produced by the remaining generators.

For example:

   links>
     for (fruit <- ["apple", "orange", "banana"], i <- [1..4])
       [(i, fruit)];
   [(1, "apple"),  (2, "apple"),  (3, "apple"),  (4, "apple"),
    (1, "banana"), (2, "banana"), (3, "banana"), (4, "banana"),
    (1, "orange"), (2, "orange"), (3, "orange"), (4, "orange")] : [(Int, String)]

You can also impose an order on all the elements produced by the series of generators in a comprehension header, as in:

   links>
     for (fruit <- ["apple", "orange", "banana"], i <- [1..4])
     orderby (fruit)
       [(i, fruit)];
   [(1, "apple"),  (2, "apple"),  (3, "apple"),  (4, "apple"),
    (1, "banana"), (2, "banana"), (3, "banana"), (4, "banana"),
    (1, "orange"), (2, "orange"), (3, "orange"), (4, "orange")] : [(Int, String)]

Links will produce a list of tuple elements as dictated by the generators, then sort them, and finally evaluate the body expression for each element produced. Note that it is the source elements, not the body elements, which are sorted.

The effect of multi-generator comprehensions is much like that of nested comprehensions: the comprehension

     for (fruit <- ["apple", "orange", "banana"], i <- [1..4])
       [(i, fruit)];

behaves just like this one:

     for (fruit <- ["apple", "orange", "banana"])
       for (i <- [1..4])
         [(i, fruit)];

But multi-generator comprehensions are different from the nested counterparts when it comes to clauses such as orderby. This is because the orderby clause sorts the list of tuples produced by all the generators in the most recent comprehension header. When using nested single-generator comprehesions, you are sorting one series of elements which is then collected by another comprehension, for a result than may not obey the desired ordering. For example,

   links> for (fruit <- ["apple", "orange", "banana"])
            for (i <- [1..4])
            orderby (i)
              [(i, fruit)];
   [(1, "apple"),  (2, "apple"),  (3, "apple"),  (4, "apple"),
    (1, "banana"), (2, "banana"), (3, "banana"), (4, "banana"),
    (1, "orange"), (2, "orange"), (3, "orange"), (4, "orange")] : [(Int, String)]

Database access

To access the database, use database and table expressions, which produce first-class values representing, respectively, database connections and tables within such connected databases.

To connect to a given database instance using a particular database driver, use a database expression.

  database <database-name> <driver-name> <parameter-string>;

The <parameter string> is a colon-separated string whose interpretation is driver-dependent, but which usually includes the hostname, port, username and password. For example, to connect to the database ice_cream using the mysql driver, using the default host name and the port 3306, and bind the resulting connection to the variable db, use

  var db = database "ice_cream" "mysql" ":3306:web:";

The driver and parameter string default to the configuration values database_driver and database_args, so at a minimum you need only give the instance name:

  var db = database "ice_cream";

Supported drivers include postgres, mysql and sqlite. The <parameter-string> field is presently only used by the postgres and mysql drivers and has the form:

  host:port:user:password

When a database expression is evaluated on the server a connection is automatically opened if one is not already open.

Table-handles are created using table expressions, which need the table's name, its type, as a record, and the database connection (a value produced by a database expression as above):

  table <table-name> with <row-type> from <database-connection>;

For example, to create a handle to the table ice_cream with fields i and f both of type Int associated with the database db, and bind it to the variable fac_table, use:

  var parlors = table "ice_cream_parlors" 
                with (name : String, fans : Int) from db;

The variable parlors then has type TableHandle((name : String, fans : Int), (name : String, fans : Int), (name : String, fans : Int)). The three arguments to the TableHandle type constructor together identify the type of a row and its constraints (See Table constraints).

Using tables

To iterate over the items of a table--or just to fetch one row--use for comprehensions.

Since a table-handle in Links is different from a list, we cannot simply draw elements directly from a table as we draw them from a list. But a special form of generator accepts a table-handle as its source:

  var parlors = table "ice_cream_parlors" 
                with (name : String, fans : Int) from db;
  for (p <-- parlors)
  where (p.flavors > 1)
    [x]

This comprehension selects all the elements from the ice_cream_parlors table that serve more than one flavor.

The special long-arrow generator <-- accepts a table-handle, and not a list, as its right-hand side.

The long-arrow generator also specifies that the comprehension in which it participates must be executed by a single SQL query at run time. When an expression is executed as a query, it should be relatively efficient because, for example, any comprehension filtering should be performed by the database, before involving the Links runtime system. The database system may be able to do this efficiently by using its indexes.

When you don't want to specify that an expression becomes a query, you can extract all the rows of a table directly using the special function asList. The asList function takes a single argument, a table handle, and produces a list of all the rows in the table at the time when the function is applied.

For example, you may write

  for (p <-- parlors)
   where (p.flavors <= 20)
    [(name = p.name)]

to extract from the parlors table the names of those with fewer than 20 flavors. This might be compiled into an SQL query such as

  select p.name as name from parlors as p
  where p.flavors <= 20

The asList function extracts an entire table at once:

  sig asList : (TableHandle((|a::Base),_,_)) -> [(|a::Base)]
  fun asList(t) {for (z <-- t) [z]}

Thus the following expression gives the same result as the previous query for selecting shops with fewer than 20 flavors, but it does more work outside the database system:

  for (p <- asList(parlors))
   where (p.flavors <= 20)
    [(name = p.name)]

It would use the following simple SQL to get its raw data:

  select * from parlors

and the Links runtime system would filter the data itself.

Database modifications

Links can also modify database tables using insert, update and delete operations, which work similarly to the corresponding SQL statements.

For example, the Links expression

  insert t values (f1, ..., fk) rs

inserts the list of rows rs into the table t where rs must be a list of records with fields f1, ... fk.

The Links expression

  update (var r <-- t)
   where condition
   set (f1 = v1, ..., fn = vn)

updates the rows of table t that satisfy condition by setting each field fi to vi.

The Links expression

  delete (var r <-- t)
   where condition

deletes the rows of table t that satisfy condition.

Table constraints

Database fields can be given two kinds of special features: they can be made read-only and they can be defaulted or given default values when inserting.

A read-only field must not be included in an insert or an update. A defaulted field need not, and cannot, be included in an insert or update command.

To enable these special features on a table field, add a where clause to the table expression that lists field names with the corresponding keyword. For example:

  table "people" with (id:Int, name:String, nationality:String) 
                 where id readonly, nationality default from db

This returns a table handle for the table people with fields id and nationality, where the id field cannot be modified through this handle, and nationality has a default value so giving a value is optional when modifying the table.

The type assigned by Links to a table handle has three fields, indicating the readable fields, the writeable fields and the needed fields, in that order:

  links>
    var t = table "people" with (id:Int, name:String, nationality:String) 
                           where id readonly, nationality default from db;
  t = (table people) : 
    TableHandle((id:Int,name:String,nationality:String),
                (name:String,nationality:String),
                (name:String))

Only name and nationality are writeable, as id is read-only. Only name is needed when inserting, as nationality has a default and id is read-only.

Database abstraction

Links supports a powerful mode of abstraction over query fragments.

A simple example occurs when you want to filter a table, but using some abstract predicate given by another piece of code. For example, you are writing an API to access some database tables, and you want the API caller to be able to choose how to filter the rows.

A natural way to do this in Links is to represent the filter using a predicate function (one mapping table rows to booleans):

  fun selectedShops(pred) {
    for (p <-- parlors) where (pred(p)) e
  }

In this situation, since the long-arrow generator is used, the programmer demands that the entire comprehension be executed as a single SQL query--even though the predicate on which to filter is not known until runtime.

Still, Links guarantees at compile time that this is achievable--or gives a compile-time error if not. This involves analyzing the program to ensure that callers of selectedShops never pass a function that Links cannot translate to SQL in the context of the above query. If so, an error is generated, usually at the call site, explaining that the function passed was ``wild''--the Links term for a function that doesn't translate to SQL.

How to determine at a glance whether your function will translate to SQL? In brief, the only sorts of functions that should fail to translate are those that use ``wild'' primitives, and recursive functions.

To recognize wild and tame functions in the compiler's output, look at the function arrows. A straight arrow -> indicates a tame function, which can be translated to the database. A squiggly arrow ~> indicates a wild function, which cannot be compiled to the database. You can also use these arrow forms in the source, in type annoations, to constrain functions to the way you think they should act (or force the compiler to produce an error message if not).

A comprehension with a long-arrow generator (See Using tables) implicitly requires the entire comprehension to be translatable to a single SQL query, and this effectively requires every function it calls to be tame. To assert this requirement explicitly, or to make it elsewhere than a comprehension, use the query block form (See Query blocks).

Besides the issue of wildness, a query expression is not considered SQL-translatable if its result type is not the type of a flat relation--a list of records with fields of base type (Int, String, Float, and so on).

Query blocks

To assert explicitly that an expression--which need not be a comprehension--translates to a query, wrap it in a query block:

  query {
    e
  }

As with long-arrow comprehensions, the expression e must be ``tame'' and must have relation type: a list of records with fields of base type, or the compiler will give an error.

Use query assertions to offload parts of your computation to the database system, taking advantage of its efficient query-planning facilities.

Structural pattern matching

As well as lists and variants, pattern matching can also be performed on constants.

  switch (name) {
    case "Ezra" -> "Cooper"
    case "Jeremy" -> "Yallop"
    case "Phil" -> "Wadler"
    case "Sam" -> "Lindley"
  }

In common with other strongly-typed functional programming languages such as ML and Haskell, Links supports deep pattern matching.

  fun firstChildId(tree) {
    switch (tree:mu a.[|Node:(a, b, a) | Leaf|]) {
      case Node(Node(_, id, _), _, _) -> Some(id)
      case Node(Leaf, _, Node(_, id, _)) -> Some(id)
      case _ -> None
    }
  }

This function takes a labelled binary tree, and returns the id of the first child of the root, if it exists. The wildcard pattern _ matches any value. The type annotation is necessary if you want to constrain the function to only operate on labelled binary trees.

Using switch it is possible to dispatch according to the structure of a value. If the structure of a value is already known, then patterns can also be used in other contexts. Patterns can be used to bind function arguments.

  fun foo((x=a, y=b)) {a+b}
  foo((x=2,y=3))

This example returns 5. Patterns can also be used to bind local variables.

  {
    var language = Language("Links");
    var Language(name) = language;
    name
  }

This example returns "Links".

Regular expressions

Rudimentary regular-expression (regex) handling is provided for matching string against regex patterns.

A regular expression is written between slashes, as /this.is.a.regex/.

The operator =~ tests the string on its left-hand side against the regular expression on its right-hand side. For example:

    if ("haystack" =~ /ne+dle/)
      print("Found needle in haystack.")
    else
      print("No needle found in haystack.")

Presently the regex support is somewhat limited; only the following constructions are supported (e standing for any regex):

    a         (an unadorned character) match exactly that character
    [a-z0-9-] match any single character in the given range
    .         match any single characer
    e*        match e zero or more times
    e+        match e one or more times
    e?        match an empty string, or e exactly once.
    (e)       matches whatever e matches

Any of the special characters, * + ? ( ) [ and ], can be escaped with a backslash, in which case they match precisely that character, instead of having a special meaning.

Unlike most other regex systems, Links regexes are anchored on both ends. So, to match a substring, you need to add .* at both ends. For example:

    links> "Portobello" =~ /bell/
    false : Bool
    links> "Portobello" =~ /.*bell.*/
    true : Bool

Special feature: regexes and the database

Certain uses of regular expressions can be translated to an SQL expression involving the LIKE operator. For example, the expression

    for (uni <-- universities)
    where (uni.city =~ /.*edinburgh.*/)
      [uni]

should translate to an SQL query that selects all those unis from the universities table having edinburgh as a substring of their name column. The query might look like this:

    select id from t where t.name like '%edinburgh%'

Limitations

Currently, a regex in Links cannot be treated as a first-class value: a regex can only appear lexically as the right-hand side of a use of the ~ operator. The left-hand side may be any expression of String type.

There is presently no way to perform regex-based substitutions or to capture parts of a regex match for later use.


Typing

Types in Links

Links is a strongly-typed, statically-typed programming language.

This means that every value has a type, and every operation admits just certain types as arguments. The compiler will check that a program uses every value in a way consistent with a type. This way, for example, you can concatenate two lists but you can't concatenate two integers--it wouldn't make sense, and if you try to do it, the compiler will report an error up front. If a program passes the compiler, you can be certain that it doesn't have any type errors (modulo any bugs in the compiler!)

When you use Links through the interactive shell, it tells you the type of every result value. You can use this shell to get familiar with the types of various values. For example:

    links> 1 + 1;
    2 : Int
    links> { var name = "Gallileo"; "Hi, " ++ name };
    "Hi, Gallileo" : String
    links> (42, "The answer");
    (42, "The answer") : (Int, String)
    links> [2, 4, 6, 8];
    [2, 4, 6, 8] : [Int]
    links> (  price = 1.95, drinkName = "Latte" );
    (price=1.95,drinkName="Latte") : (drinkName:String,price:Float)
    links> Red(7);
    Red(7) : [|Red:Int | a|]
    links> Blue(7);
    Blue(7) : [|Blue:Int | a|]
    links> [(42, "The answer"),
            (7, "The number of wonders of the world")];
    [(42, "The answer"), (7, "The number of wonders of the world")] 
      : [(Int, String)]

Note that the type of an integer is Int, and the type of a string is String. The pair (42, "The answer") has a type that indicates the first part of the pair is an Int and the second part is a String; this type is written (Int, String).

A list type is written with square brackets around the type of its elements. Note that all elements of a list must be of the same type. This allows you to apply an operation to an arbitary element of the list and to know what type it is. If you need to mix different types within a list, you can do so by assigning variant tags to each.

By constrast, elements of a tuple can be of the same or of different types, but the number of elements of a tuple is fixed.

The type of a record indicates exactly which fields should be present, and what their types are. The type of a variant tells you what its label is, and what type is contained in that label. (Note that, when you have a function, it may yield a variant type that allows more than one label. Each label has a specific type for its content.)

When you try to evaluate an ill-typed expression, the system will give you a type error:

    links> [2, 4, 6, 8, "Who do we appreciate?"];
    <stdin>:1: Type error: Cannot unify abstract type 'List' with type 'Int'
    In expression: [2, 4, 6, 8, "Who do we appreciate?"].

Base types

The base types offered by Links are

  Bool  Int  Char  Float  Xml  Database

The type String is not really a base type, but rather an alias for [Char] (the type for lists of characters). This means that any general list operation will work on a String.

Type annotations

Links can infer type information for any program you give it. This means that you typically don't have to declare any types. But if you get a type error, it may not point to the part of the code where you really made the mistake.

To help the interpreter understand what you intend, and to help it pin down the error to the right place, you can give an optional type annotation on any expression. This is done using the colon as the type ascription operator, as follows:

    2 + 2 : Int            # OK
    "two plus two" : Int   # ERROR

When dealing with constants, as above, the type is always obvious. Type annotation is more useful when dealing with functions, whose type may be inferred in a way that's not obvious. Suppose for example that you have a function h that you know should always return a list of Ints. You can tell the compiler this, and it will check that it is the case:

  fun h(x, y) {
    f(x) ++ g(y) : [Int]
  }

This can be useful because f and g may in fact return types that are compatible (such as [String]), but are not what you expect. Since they are compatible, the function h will be type-correct; but elsewhere in the program, you may get a confusing type error involving h. Providing an annotation for h, like this, will force the compiler to tell you if h does not have the type you expect.

Type aliases

Sometimes it is convenient to define an alias for a type in order to avoid having to write out the whole type every time it is used. For instance, suppose you want a type for representing colours:

  [|Red | Orange | Yellow | Green | Blue | Indigo | Violet|]

Then you can write:

  typename Colour = [|Red | Orange | Yellow | Green | Blue | Indigo | Violet|];

and following this declaration you can use Colour in place of [|Red | Orange | Yellow | Green | Blue | Indigo | Violet|]. For instance, you can now define functions with the following signatures:

  sig showColour : (Colour) -> String
  sig invertColour : (Colour) -> Colour
  sig combineColours : (Colour, Colour) -> Colour

It is also possible to define parameterised type aliases.

  links> typename Pair(a, b) = (a, b)
  links> (1, true) : Pair (?, ?)
  (1, true) : Pair (Int, Bool)

It is even possible to parameterise type aliases over row variables.

  links> typename Record(r::Row) = (x:Int|r);
  Record = a.(x:Int|a)

Currently type aliases are only allowed at the top-level.

Type variables

Type variables in Links are written as lower case names.

  sig id : (a) -> a
  fun id(x) {x}

We distinguish between rigid type variables and flexible type variables. A rigid type variable (like the a in the id function above) is a proper polymorphic object-level type variable that can be instantiated to any type when its binding function is applied. A flexible type variable (also called a weak type variable or unification type variable) is a monomorphic meta-level variable that will be instantiated to a object-level type by the type inference algorithm. A flexible type variable is indicated by prefixing a type variable with a question mark ?. Flexible type variables are useful if you want to assert the shape of the type of an expression, but you want type inference to fill in some of the type for you.

  links> (1, (2, ((3, fun (x) {x}), "a")), true) : (Int, ?a, Bool);
  (1, (2, ((3, fun), "a")), true) : (Int, (Int, ((Int, (a) -> a), String)), Bool)

If you only want to use a flexible type variable once, then you don't need to name it.

  links> (1, (2, ((3, fun (x) {x}), "a")), true) : (Int, ?, Bool);
  (1, (2, ((3, fun), "a")), true) : (Int, (Int, ((Int, (a) -> a), String)), Bool

A rigid type variable that is only used once is written as _. This notation is also used by the pretty-printer.

  links> fun (x) {0};
  fun : (_) -> Int

Function types and effects

Function typess in Links are annotated with effects. The primitive form of a function type is:

  (A1, ..., An) {E}-> B

This is the type of a function that takes n arguments of types A1, ..., An, has effects E and returns values of type B.

The effects E are represented as a row type (as used in record and variant types). If an effect is present then that indicates that it might happen at run-time. If an effects absent then that indicates that it will definitely not happen at run-time.

Currently the only two effects that are tracked in Links are wild, which corresponds to code that cannot be run in the database, and hear:A which corresponds to code that can 'hear' messages of type A. In the future we plan to experiment with tracking other effects.

In an effort to make function types more readable, syntactic sugar is provided for a number of common classes of function arrow. As well as being available to the programmer, the syntactic sugar is also used by the pretty-printer. The meta-variable X stands for a rigid ( a or _) or flexible (?a or ?) row variable.

  -X->           {|X}->
  ->             {|_}->
  {E}~>          {wild,E}~>
  ~X~>           {|X}~>
  ~>             {|_}~>
  {:A}~>         {hear:A}~>
  {:A,E}~>       {hear:A,E}~>
  {:A|X}~>       {hear:A|X}~>

Kinds and subkinding

The Links type system has three kinds for classifying types and type variables: standard types and type variables have kind Type, rows and row variables have kind Row, and presence flags and presence variables have kind Presence.

In order to allow the type system to enforce the constraint that queries can only return lists of records of base type, Links also supports a very basic form of subkinding. Type variables and row variables can be annotated with a subkind. Type variables either have subkind Any, which means they can be instantiated to any type, or subkind Base which means they can only be instantiated to base types. The subkind Base is a subkind of Any. Similarly, row variables either have subkind Any or subkind Base. A row variable of subkind Base can only be instantiated with a row if all the field in the row are base types. Type inference introduces subkind annotations when inferring the type of a query.

  links> fun (e) {query {e}};
  fun : ([(|a::Base)]) -> [(|a::Base)]

The base types are: Int, Bool, Float, Char and String.

(In fact String is an alias for [Char]. This leads to a hole in our SQL compilation scheme, as it fails to disallow string manipulation code that cannot be compiled to SQL, such as the following:

  fun (t) {for (r <-- t) [(name=(for (c <- r.name) where (c <> 'a') [c]):String)]}

This problem will be fixed in a future version of Links by changing String to be a distinct base type.)


Features

Formlets

Links allows the abstraction of small fragments of an HTML form, called formlets, so that the submitted data can be packaged into more meaningful components, and the submitted data can be abstracted from its HTML-level representation.

As an example, you may wish to create a formlet to abstract the notion of a ``date entry field.'' Concretely, in HTML, this might take a variety of forms; it might use just one input field, or a field for each component (day, month, and year), or it might present a graphical control for the user to select a date.

The client code of the formlet wants to receive the date value in one lump, not as individual fields; furthermore the format of the data received by the client need not be the same as the format sent by the browser. As a result, the designer of the date formlet may change the HTML representation without changing the date representation that is returned to the client code. We say that the formlet is abstracted from the HTML representation of a form.

This abstraction is achieved in Links using a syntactic construct known as formlet/yields. An example date formlet might look like this:

  fun dateFormlet(msg) {
    formlet <#>
      {stringToXml(msg)} <br />
      Day:   {inputInt -> day}
      Month: {inputInt -> month}
      Year:  {inputInt -> year}
    </#>
    yields {
      Date(day, month, year)
    }
  }

This function takes a string msg from the caller (a bit of text to include before the input fields) and constructs an XML value including three input fields with the msg above them. The inputs are introduced using the formlet binding construct, which looks like this:

   { body -> x }

The expresion body (in the above example, inputInt) is itself a formlet; it has a concrete realization (inputInt is realized simply as an HTML <input> field) but will later (when the web user submits the form) produce a value, which will be bound to x. x is in scope throughout the yields expression.

The yields expression is where the data is packaged for the client. It is evaluated only when the form is submitted. The value emitted by the yields expression can further be bound by yet another formlet binding.

Thus, to continue the example, another piece of code can make use of the dateFormlet as follows:

  formlet
    <table>
    <tr><td>
      {dateFormlet("Arrival") -> arrival}
    </td><td>
      {dateFormlet("Departure") -> departure}
    </td></tr>
    </table>
  yields
    Itinerary(arrival, departure)

Here the values bound to arrival and departure are the values emitted by the respective yields bodies of the formlets returned by calls to dateFormlet; in this case they will be Date-tagged values such as Date(28, 2, 2007).

The type of a formlet/yields expression is Formlet(a), where a is the type returned by the yields body. The function dateFormlet thus has type String -> Formlet([|Date:(Int, Int, Int)|]).

A Formlet(a) value cannot be embedded in HTML until it has been ``rendered''. To render such a value, use the render function:

    render(formlet <#> {input -> x} </#> yields { x }, handler)

render has type (Formlet (a), (a -> XML)) -> XML.

Having rendered a formlet as a form, the resulting XML can be returned to the top-level, and thence served to the browser. When the user submits the form, all the submitted data will be processed by all the intermediate yields bodies, and finally the function handler will be called.

Concurrency

A Links program begins as a single thread of control but can fork into many processes by executing the spawn primitive:

  var newProcID = spawn { expression };

This starts a new process which begins by evaluating expression. The value of the expression is discarded if evaluation ever completes. spawn returns an identifier of the new process to the calling process. This identifier can be used to address messages to the new process, with the ! primitive (pronounced ``send''):

   procId ! msg

This appends the value msg to the mailbox for the process identified by procID. The return value is just (). The mailbox is FIFO, so if you know that some message is sent before some other message, you know they will be received in that order.

Each process's mailbox is given a static type according to the messages it expects to receive. Typically, a process will use variants to tag the various messages it can receive; for example, a process that can expects to be informed of passing comets and celebrity sightings might expect to receive either a value CelebritySighting(celebName, atVenue) or a value PassingComet(cometID, zenith, azimuth). This process's mailbox would be given the type [|CelebritySighting:(String, String) | PassingComet:(Int, Float, Float)|]

A process can receive messages using the recv function, which returns the next message in the current process's mailbox.

  var nextMsg = recv()

More commonly, however, you will want to dispatch on the received message's variant tag immediately. The receive construct makes this easy to do.

  receive {
    case CelebritySighting(celebName, atVenue) -> e1
    case PassingComet(cometID, zenith, azimuth) -> e2
  }

This removes the next message from the mailbox and does a switch on it. (See Polymorphic variants).

Located code and remote procedure calls

A client-side Links program can make seamless remote procedure calls to the server, and the server can in turn make remote procedure calls to the client. The server stores no state while the computation is proceeding at the client.

To use this feature, simply annotate functions with the keyword ``client'' or ``server''. For example:

  fun lookupUserID(username) server {
    for (user <- users)
    where (user.name = username)
      [user.id]
  }
  fun setPageBackground(color) client {
    doc = domGetDocumentRef();
    domSetAttributeRef(doc, "style", "background-color: " ++ color);
  }

A function annotated with ``server'' will always run on the server, even if it is called from code running on the client. Likewise for ``client'' functions.

A server call is implemented as a distinguished HTTP request. A client call is implemented as a special HTTP response body which the client-side code recognizes as a remote call.

Programs that neither contain client-annotated definitions nor make use of client primitives (such as the DOM functions described below) are processed differently by Links. In such cases plain HTML, not JavaScript, is sent to the browser, and all interaction takes place via standard HTML form submission.

Handling User Actions

A Links program usually runs on a client (browser) and a server. The client code can be notified when certain user events take place, such as pressing a key on the keyboard or clicking on a hyperlink. The code that responds to such events is associated with particular nodes in the DOM tree and is expressed through special XML attributes, called l-event attributes, attached to the relevant XHTML tags. This section defines the attributes and their use.

An l-event attribute begins with the prefix l: and corresponds to an event type as defined by the browser. The event type is determined by the attribute name simply by stripping off the l: prefix. Widely-supported l-event attributes include:

When an element containing one of these attributes is installed in the current DOM document (see Modifying the Web Page), the associated code is registered as a handler for the corresponding event type. If the element is removed from the DOM, its event handlers become inactive.

The content of one of these attributes should be a Links expression enclosed in curly braces, an expression of type () which may perform side-effects in response to the event. For example:

    <input type="button" value="Add Photo"
           l:onclick="{domAppendChild(container,
                                         <img src="icon.jpeg"/>)}" />

Interpreting the Event

In an l-event handler, there is a special variable in scope called event. This variable contains an object of type Event which can be accessed using the following API. This API corresponds to the Yahoo! Web UI library (http://developer.yahoo.com/yui/). These methods are implemented in a browser-independent way. These functions constitute the sole interface to the Event object: the object is not a record and has no directly-accessible fields.

getTarget

An event generally has a target, a DomNode at which the event has occurred (for example, a mouse-down event will have as its target the leaf node upon which the mouse was clicked). This method returns the target.

getTargetElement

This function behaves like getTarget, except that if the node is a text node (rather than an element node), it returns the node's parent (which must be an element node).

getFromElement
getToElement

Mouse-out and mouse-over events have two targets. The function getFromElement returns the element the mouse came from. getToElement returns the element from which the mouse came.

getPageX
getPageY

Some events refer to a particular point on the page. These methods return the x and y coordinates of that point, relative to the top-left corner of the page (right and down are the positive directions).

getTime

Modifying the Web Page

XML Quasiquote expressions construct XML values, but these values are not rendered by a browser immediately. XML values can be used to create DOM nodes, which in turn can be installed in the active document.

There are two ways to install nodes in the active document:

  1. An XML value returned by the final expression of a Links program is converted to a DOM node and installed as the (initial) active document. This is true in client context as well as server context.
  2. A set of primitives (beginning with dom-) are provided for making modifications to particular parts of the document. These operations only work in client context.

Elements that are not installed are not visible, nor do events occur on them, so their event handlers cannot fire. (See Handling User Actions.)

Links makes a distinction between XML trees, which are immutable, and DOM nodes, which are mutable objects. DOM nodes are owned by a DOM manager module, and all interaction with DOM nodes is through calls to that module, using objects call DomNodes. You can create DOM nodes by passing an XML tree: the XML constitutes a model for the DOM node, which is a rough copy of the XML. Likewise, from a DomNode, you can extract an XML image of the DOM node and its DOM subtree. The browser may have special semantics associated with DOM nodes, and may add or remove attributes and children without warning; XML values, however, are completely within the control of the Links program.

DOM operations

The operations for DOM interaction are as follows.

Operations ending with Ref work on DOM nodes; operations ending with Xml take an XML value and convert it to a DOM node before working with it.

insertBefore(xml, beforeNode)

Creates a sequence of nodes using xml as the image, and adds them as the previous sibling of the node beforeNode.

appendChildren(xml, parentNode)

Creates a sequence of nodes using xml as the image, and adds them as the after the last child of the node parentNode.

replaceNode(xml, node)

Replaces node with xml.

replaceDocument(xmlVal)

Replaces the entire document with xmlVal.

domInsertBeforeRef(moveRef, beforeRef)

Moves the node moveRef to become the previous sibling of the node beforeRef.

domAppendChildRef(newChildRef, parentRef)

Moves the node newChildRef to become the last child of the node parentRef.

removeNode(node)

Removes node from its current position, leaving it with no parent. This makes it an orphan.

replaceChildren(xml, node)

Replaces the children of node with xml.

swapNodes(node1, node2)

Swap node1 with node2.

getDocumentNode()

Return a reference to the document element, that is, the top-level of the installed document.

getNodeById(id)

Given a string ID value, finds the DOM node with that ID and returns it. If more than one node has that ID, the behavior is undefined.

getValue(node)

Translates the given DOM node into an XML value and returns that.

isNull(node)

Returns true if node is null.

isElementNode(node)

Returns true if node is not null and is an element node.

Note: when manipulating the DOM, care should be taken to ensure that a given ID value appears only once amongst the installed DOM nodes. Some operations may behave unreliably when IDs are duplicated. Orphaned DOM nodes may reuse ID values.

You can also access attributes of DOM nodes, and navigate the DOM tree, in a read-only fashion using the following operations:

domGetTagNameFromRef(ref)

Returns the tag name of ref as a string. For example, a div node would return the string "div".

domGetAttributeFromRef(ref, attrName)

Returns the value of the attribute attrName on the node ref. For example, on a node pointed to by ref with the XML representation

  <a href="more_info.html">More info</a>,

the call domGetAttributeFromRef(ref, "href") would return the string "more_info.html".

domGetParentFromRef(ref)

If the node pointed to by ref has a parent, this call returns a reference to that parent node.

domGetFirstChildFromRef(ref)

If the node pointed to by ref has children, this call returns a reference to the first child (in document order).

domGetNextSibingFromRef(ref)

If the node pointed to by ref has siblings occurring after it, this call returns a reference to the next sibling.

domGetChildrenFromRef(ref)

Returns a list of references to the child nodes of the node ref. NOT YET IMPLEMENTED

XML operations

For inspecting pure XML values, the following functions are provided:

getTagName(xml)

Returns the tag name of the given XML node, as a string. For example:

    getTagName(<h1>Dog Bites Man</h1>)

returns the string "h1". It is an error to apply this to an XML text node.

getTextContent(xml)

Given an XML text node xml, returns the text as a string. It is an error to apply this to an element node.

getAttributes(xml)

Fetches a list the list of attributes associated with the given XML value.

    getAttributes(<div class="sidebar" />)

returns [("class", "sidebar")]. It is an error to apply this to an XML text node.

hasAttribute(xml,attrName)

Returns true if the given XML value has the named attribute and false otherwise. For example

    hasAttribute(<div class="sidebar" />, "class")

returns true. It is an error to apply this to an XML text node.

getAttribute(xml,attrName)

Fetches the named attribute of the given XML value, as a string. For example

    getAttribute(<div class="sidebar" />, "class")

returns "sidebar". It is an error to apply this to an XML text node.

getChildNodes(xml)

Returns a list of the child nodes of this xml node.

Cookies

When in the context of an HTTP request, cookies can be manipulated using these library functions:

    getCookie : (String) ~> String
    setCookie : (String, String) ~> unit

Each takes the cookie name as its first parameter. getCookie returns the current value of the cookie, as it was passed with the HTTP request. setCookie takes, as its second parameter, the new value for that cookie. There is currently no way to control the path/domain restrictions of the cookie or the expiration date. All cookies set by these utilities are session cookies and expire when the browser is closed. More flexible support for cookies is planned for future work.


Library functions

Links has various built-in library functions. A full list of these, along with their types, can be obtained by typing @builtins; in the interactive shell.

Explicit type conversions

 stringToInt      : (String) -> Int
 intToFloat       : (Int) -> Float
 intToString      : (Int) -> String
 floatToString    : (Float) -> String
 stringToXml      : (String) ~> Xml
 intToXml         : (Int) ~> Xml
 floatToXml       : (Int) ~> Xml

These function convert between values of different types.

 ord              : (Char) -> Int
 chr              : (Int) -> Char

The ord and chr functions converts between a character and its ASCII value as an integer.

Negation

 not              : (Bool) -> Bool
 negate           : (Int) -> Int
 negatef          : (Float) -> Float

Character classification

 isAlpha          : (Char) ~> Bool
 isAlnum          : (Char) ~> Bool
 isLower          : (Char) ~> Bool
 isUpper          : (Char) ~> Bool
 isDigit          : (Char) ~> Bool
 isXDigit         : (Char) ~> Bool
 isBlank          : (Char) ~> Bool

(isXDigit returns true for hexadecimal digits)

Case conversion

 toUpper          : (Char) -> Char
 toLower          : (Char) -> Char

Floating point functions

 floor            : (Float) -> Float
 ceiling          : (Float) -> Float
 cos              : (Float) -> Float
 sin              : (Float) -> Float
 tan              : (Float) -> Float
 log              : (Float) -> Float
 sqrt             : (Float) -> Float

Miscellaneous

 print            : (String) ~> ()

Print a string to the standard output. (This function has undefined behaviour when Links is running as a CGI script.)

 error            : (String) ~> a
 
Raise a fatal error.
 debug            : (String) ~> ()

Output a string to stderr. When running a client function this will output to a special window only if links is run with the -d flag. When running a server function stderr is typically forwarded to the web server's error log.

 sleep            : (Int) ~> ()

Wait for the specified number of seconds.

 exit             : (a) ~> b

Exits the program, using the argument to exit as the program's return value.

(FIXME: this can be ill-typed.)


Standard library (prelude)

A set of standard Links functions is defined in the file prelude.links. This file is loaded automatically whenever Links runs a program. The functions it defines are explained below.

List utilities

Web-related features

These functions only make sense in the context of web programs.

Formlet library

These routines can be used with the formlet/yields syntax described in Formlets. Familiarize yourself with that mechanism before trying to read about these functions.


Running links

Links can be run either as a CGI script through a web server such as Apache, or from the command line.

When run as a CGI script Links must be passed a filename as an argument:

  links filename

The source code to evaluate can be specified in several ways:

  links                # run interactively
  links -e exp         # evaluate exp, print its result
  links filename       # run filename

Command line flags can be used to enable/disable some features:

  -O                   enable optimisation
  -n                   don't print types
  -d                   debug mode (for hacking on links itself)

A configuration file can be chosen using this command-line option:

  --config=<filename>  set a configuration file

These other options can be fun, too:

  --pp=<command>          run all input through UNIX <command> before parsing
  --precompile=<filename> precompile and store a cached version of <filename>
  --dump=<filename>       find and print out the cached version of <filename>

The interactive shell

The interactive shell allows server-side code to be entered dynamically. In order to evaluate an expression it must be terminated with a semi-colon followed by a new line:

  ./links
  links> 1+2;
  3 : Int
  links>

The interpreter outputs the resulting value and its type.

Directives

The interpreter supports a number of directives. Typing one of these at the interactive loop has an immediate effect.

@directives

list available directives

@settings

print available settings (see Configuration settings)

@set

change the value of a setting (see Configuration settings)

@builtins

list builtin functions and values

@quit

exit the interpreter

@typeenv

display the current type environment

@env

display the current value environment

@load

load in a Links source file, replacing the current environment

@withtype

search for functions that match the given type

Configuration settings

Links offers a number of settings which can affect the behavior of the interactive loop or the web interface.

The available settings can be discovered in the interactive loop using the @settings directive:

    links> @settings;
    
    User settings
     show_unification          false
     show_typechecking         false
    ...

A setting can be modified during an interactive session using the @set directive:

    links> @set show_typechecking true;

Settings can also be configured using a configuration file. To select a configuration file, use the command-line option --config:

    $ links --config=<filename>

A configuration file is just a set of lines, each of which contains a setting formatted as <setting-name>=<setting-value>. For example:

    debug=on
    database_driver=postgresql

This is particularly useful for configuring settings for a web application, where directives cannot otherwise be given.