add draft section of mmmfs

author: s-ol <s-ol@users.noreply.github.com> 2019-10-24 17:52:05 +0000
committer: s-ol <s-ol@users.noreply.github.com> 2019-10-24 17:52:05 +0000
commit: 8ee96a4cdbd1e10f45186fd0119623f743d9eb33 (patch)
tree: aee43ba17c7b07c9f9356395d18b1dceceda8280 /root
parent: add mermaid graph convert (diff)
download: mmm-8ee96a4cdbd1e10f45186fd0119623f743d9eb33.tar.gz
mmm-8ee96a4cdbd1e10f45186fd0119623f743d9eb33.zip
3 files changed, 226 insertions, 0 deletions
diff --git a/root/articles/mmmfs/mmmfs/mainstream_fs/text$mermaid-graph b/root/articles/mmmfs/mmmfs/mainstream_fs/text$mermaid-graph
new file mode 100644
index 0000000..3227e47
--- /dev/null
+++ b/root/articles/mmmfs/mmmfs/mainstream_fs/text$mermaid-graph
@@ -0,0 +1,11 @@
+graph TD
+  Documents --> Movies --> m1{{A Movie.mp4}}
+  Documents --> Pictures --> vacation[Summer Vacation]
+  vacation --> v1{{1.jpg}}
+  vacation --> v2{{2.jpg}}
+  Pictures --> Projects
+  Projects --> p1{{1.jpg}}
+  Projects --> p2{{2.jpg}}
+
+  classDef file fill:#f9f;
+  class m1,v1,v2,p1,p2 file;
diff --git a/root/articles/mmmfs/mmmfs/schematic/text$mermaid-graph b/root/articles/mmmfs/mmmfs/schematic/text$mermaid-graph
new file mode 100644
index 0000000..7b4f6f4
--- /dev/null
+++ b/root/articles/mmmfs/mmmfs/schematic/text$mermaid-graph
@@ -0,0 +1,21 @@
+graph TD
+  Documents --> Movies --> m1[A Movie.mp4]
+  m1 --> m1.f{{video/mp4}}
+
+  Documents --> Pictures --> vacation[Summer Vacation]
+  vacation --> slideshow{{slideshow: script -> UI}}
+  vacation --> gallery{{gallery: script -> UI}}
+  vacation --> v1[1.jpg]
+  vacation --> v2[2.jpg]
+  v1 --> v1.f{{image/jpeg}}
+  v2 --> v2.f{{image/jpeg}}
+
+  Pictures --> Projects
+  Projects --> p1[1.jpg]
+  Projects --> p2[2.jpg]
+  p1 --> p1.f{{image/jpeg}}
+  p2 --> p2.f{{image/jpeg}}
+
+  classDef default fill:#f00,stroke:#333,stroke-width:4px;
+  classDef file fill:#f9f;
+  class slideshow,gallery,m1.f,v1.f,v2.f,p1.f,p2.f file;
diff --git a/root/articles/mmmfs/mmmfs/text$markdown.md b/root/articles/mmmfs/mmmfs/text$markdown.md
new file mode 100644
index 0000000..f98f0fb
--- /dev/null
+++ b/root/articles/mmmfs/mmmfs/text$markdown.md
@@ -0,0 +1,194 @@
+`mmmfs` seeks to improve on two fronts.
+
+One of the main driving ideas of the mmmfs is to help data portability and use by making it simpler to inter-operate with different data formats.
+This is accomplished using two major components, the *Type System and Coercion Engine* and the *Fileder Unified Data Model* for unified data storage and access.
+
+# The Fileder Unified Data Model
+The Fileder Model is the underlying unified data storage model.
+Like almost all current data storage and access models it is based fundamentally on the concept of a hierarchical tree-structure.
+
+<mmm-embed path="mainstream_fs">schematic view of an example tree in a mainstream filesystem</mmm-embed>
+
+In common filesystems as pictured, data can be organized hierarchically into *folders* (or *directories*),
+which serve only as containers of *files*, in which data is actually stored.
+While *directories* are fully transparent to both system and user (they can be created, browser, listed and viewed by both),
+*files* are, from the system perspective, mostly opaque and inert blocks of data.
+Some metadata is associated with them (filesize, access permissions),
+but notably the type of data is generally not actually stored in the filesystem,
+but is determined loosely based on multiple heuristics based on the system and context, notably:
+- Suffixes in the name are often used to indicate what kind of data a file should contain.
+  However there is no standardization over this, and often a suffix is used for multiple incompatible versions of a file-format.
+- Many file-formats specify a specific data-pattern either at the very beginning or very end of a given file.
+  On unix systems the `libmagic` database and library of these so-called *magic constants* is commonly used to guess the file-type based on
+  these fragments of data.
+  However, since not all file-formats use magic constants, and since the location and value of the magic constants varies between constants,
+  files can often (considered to) be valid in multiple formats at the same time.
+  [TODO: quote: "Abusing file formats; or, Corkami, the Novella", Ange Albertini, PoC||GTFO 7]
+- on UNIX systems files to be executed are checked by a variety of methods to determine which format would fit.
+  for script files, the "shebang" (`#!`) can be used to specify the program that should parse this file in the first line of the file.
+  [@TODO: src: https://stackoverflow.com/questions/23295724/how-does-linux-execute-a-file]
+ 
+It should be clear already from this short list that to mainstream operating systems, as well as the applications running on them,
+the format of a file is almost completely unknown and at best educated guesses can be made.
+
+Users renaming extensions:
+  https://askubuntu.com/questions/166602/why-is-it-possible-to-convert-a-file-just-by-renaming-its-extension
+  https://www.quora.com/What-happens-when-you-rename-a-jpg-to-a-png-file
+
+In mmmfs, the example above might look like this instead:  
+<mmm-embed path="schematic">schematic view of an example mmmfs tree</mmm-embed>
+
+
+Superficially, this may look quite similar: there is still only two types of nodes (referred to as *fileders* and *facets*),
+and again one of them, the *fileders* are used only to hierarchically organize *facets*.
+Unlike *files*, *factes* don't only store a freeform *name*, there is also a dedicated *type* field associated with every *facet*,
+that is explicitly designed to be understood and used by the system.
+
+Despite the similarities, the semantics of this system are very different:
+In mainstream filesystems, each *file* stands for itself only;
+i.e. in a *directory*, no relationship between *files* is assumed by default,
+and files are most of the time read or used outside of the context they exist in in the filesystem.
+
+In mmmfs, a *facet* should only ever be considered an aspect of its *fileder*, and never as separate from it.
+A *fileder* can contain multiple *facets*, but they are meant to be alternate or equivalent representations of the *fileder* itself.
+Though for some uses it is required, software in general does not have to be directly aware of the *facets* existing within a *fileder*,
+rather it assumes the presence of content in the representation that it requires, and simple requests it.
+The *Type Coercion Engine* (see below) will then attempt to satisfy this request based on the *facets* that are in fact present.
+
+Semantically a *fileder*, like a *directory*, also encompasses all the other *fileders* nested within it (recursively).
+Since *fileders* are the primary unit of data to be operated upon, *fileder* nesting emerges as a natural way of structuring complex data,
+both for access by the system and applications, as well as the user themself.
+
+# The Type System & Coercion Engine
+As mentioned above, *facets* store data alongside its *type*, and when applications require data from a *fileder*,
+they specify the *type* (or the list of *types*) that they require the type to be in.
+
+In the current iteration of the type system, types are simple strings of text and loosely based on MIME-types [TOOD: quote RFC?].
+MIME types consist of a major- and minor category, and optionally a 'suffix'.
+Here are some common MIME-types that are also used in mmmfs:
+
+- `text/html` and `text/html+frag` (mmmfs only)
+- `text/javascript`
+- `image/png`
+- `image/jpeg`
+
+While these types allow some amount of specifity, they fall short of describing their content especially in cases where formats overlap:
+Source code is often distributed in `.tar.gz` archive files (directory-trees that are first bundled into an `application/x-tar` archive,
+and then compressed into an `application/gzip` archive).
+Using either of these two types is either incorrect or insufficient information to properly treat and extract the contained data.
+
+To mitigate this problem, mmmfs *types* can be nested. This is denoted in mmmfs *type* strings using the `->` symbol, e.g. the mmmfs-types
+`application/gzip -> application/tar -> dirtree` and `URL -> image/jpeg` describe a tar-gz-compressed directory tree and the URL linking to a JPEG-picture respectively.
+
+Depending on the outer type this nesting can mean different things:
+for URLs the nested type is expected to be found after fetching the URL with HTTP,
+compression formats are expected to contain contents of the nested types,
+and executable formats are expected to output data of the nested type.
+
+It is a lot more important to be able to accurately describe the type of a *facet* in mmmfs than in mainstream operating systems,
+because while in the latter types are mostly used only associate an application that will then prompt the user about further steps,
+mmmfs uses the *type* to automatically find one or more programs to execute to convert or transform the data stored in a *facet*
+into the *type* required by the application.
+
+This process of *type coercion* uses a database of known *converts*, that can be applied to data.
+Every *convert* consists of a description of the input *types* that it can accept, the output *type* it would produce for a given input type,
+as well as the code for actually converting a given piece of data.
+Simple *converts* may simply consist of a fixed in and output type,
+such as for example this *convert* for rendering Markdown-encoded text to a HTML hypertext fragment:
+
+    {
+      inp: 'text/markdown'
+      out: 'text/html+frag'
+      transform: (value, ...) ->
+        -- implementation stripped for brevity
+    }
+
+Other *converts* on the other hand may accept a wide range of input types:
+
+    {
+      inp: 'URL -> image/.*'
+      out: 'text/html+frag'
+      transform: (url) -> img src: url
+    }
+
+This convert uses a Lua Pattern to specify that it can accept an URL to any type of image,
+and convert it to an HTML fragment.
+
+By using the pattern substitution syntax provided by the Lua `string.gsub` function,
+converts can also make the type they return depend on the input type, as is required often when nested types are unpacked:
+
+    {
+      inp: 'application/gzip -> (.*)'
+      out: '%1'
+      transform: (data) ->
+        -- implementation stripped for brevity
+    }
+
+This *convert* accepts an `application/gzip` *type* wrapping any other *type*, and captures that nested type in a pattern group.
+It then uses the substituion syntax to specify that nested type as the output of the conversion.
+For an input *type* of `application/gzip -> image/png` this *convert* would therefore generate the type `image/png`. 
+
+To further demonstrate the flexibility using this approach, consider this last example:
+
+    {
+      inp: 'text/moonscript -> (.*)'
+      out: 'text/lua -> %1'
+      transform: (code) -> moonscript.to_lua code
+    }
+
+This *convert* transpiles MoonScript source-code into Lua source-code, while keeping the nested type
+(in this case the result expected when executing either script) the same.
+
+In addition to the attributes shown above, every *convert* is also rated with a *cost* value.
+The cost value is meant to roughly estimate both the cost (in terms of computing power) of the conversion,
+as well as the accuracy or immediacy of the conversion.
+For example, resizing an image to a lower size should have a high cost, because the process is computationally expensive,
+but also because a smaller image represents the original image to a lesser degree.
+Similarily, an URL to a piece of content is a less immediate representation than the content itself,
+so the cost of a *convert* that simply generates the URL to a piece of data should be high even if the process is very cheap to compute.
+
+Cost is defined in this way to make sure that the result of a type-coercion operation reflects the content that was present as accurately as possible.
+It is also important to prevent some nonsensical results from occuring, such as displaying a link to content instead of the content itself because
+the link requires less steps to create than completely converting the content does.
+
+***
+
+Type coercion is implemented using a general pathfinding algorithm, similar to A*.
+First, the set of given *types* is found by selecting all *facets* of the *fileder* that match the *name* given in the query.
+The set of given *types* is marked in green in the following example graph.
+
+From there the algorithm recursively checks whether it can reach other types by applying all matching *converts* to the type
+that is cheapest to reach, excluding any types that have already been exhaustively-searched in this way.
+All types it finds, that have not yet been inserted into the set of given types are then added to the set,
+so that they may be searched as well.
+
+The algorithm doesn't stop immediately after reaching a type from the result set,
+it continues search until it either completely exhausts the result space,
+or until all non-exhaustively searched paths are already higher than the maximum allowed path.
+This ensures that the optimal path is found, even if a more expensive path is found more quickly initially.
+
+```
+graph LR
+graph LR
+  md_lua[text/lua -> text/markdown]
+  md[text/markdown]
+  moon[text/moonscript -> fn -> mmm/dom]
+  lua[text/lua -> fn -> mmm/dom]
+  fn[fn -> mmm/dom]
+  dom[mmm/dom]
+  moon_url[URL -> text/moonscript -> fn -> mmm/dom]
+  lua_url[URL -> text/lua -> fn -> mmm/dom]
+
+  md_lua -- cost: 1 --> md -- cost: 2 --> dom
+  moon -- cost: 5 --> moon_url
+  moon -- cost: 1 --> fn -- cost: 2 --> dom
+  moon -- cost: 2 --> lua -- cost: 5 --> lua_url
+  lua -- cost: 1 --> fn
+  moon_url -- cost: 10 --> dom
+  lua_url -- cost: 10 -->dom
+
+  classDef given fill:#ada;
+  classDef path stroke:#ada;
+  linkStyle 3,1,0,4 stroke:#8d8,stroke-width:2px
+  class md_lua,moon given
+```
author	s-ol <s-ol@users.noreply.github.com>	2019-10-24 17:52:05 +0000
committer	s-ol <s-ol@users.noreply.github.com>	2019-10-24 17:52:05 +0000
commit	8ee96a4cdbd1e10f45186fd0119623f743d9eb33 (patch)
tree	aee43ba17c7b07c9f9356395d18b1dceceda8280 /root
parent	add mermaid graph convert (diff)
download	mmm-8ee96a4cdbd1e10f45186fd0119623f743d9eb33.tar.gz mmm-8ee96a4cdbd1e10f45186fd0119623f743d9eb33.zip