NuPIC

Numenta Platform for Intelligent Computing

08 Jun 2015

HTM.java Receives New Network API

Greetings Earthlings! (…and otherwise affiliated) :-)

We’re [ insert understatement here ] “excited?” to announce the completion of the Network API (NAPI) for the Java™ port of NuPIC. This addition to HTM.java will usher in a whole new realm of possibilities for application integration and for streaming data applications.

Stream of consciousness you say? Maybe someday… :-) For now, let us be content with easy integration of Numenta’s learning, prediction, and anomaly detection algorithms into new forward-thinking Java applications, distributed data applications, and enterprise ecosystems.

Until now, Java users of NuPIC had to be content with piecing together individual algorithmic components by supplying their own duct-tape, stitching and glue. This is no longer necessary: the NAPI is very robust, has lots of features, and most importantly is hypnotizingly easy to work with!

Here’s an example of the code it takes to get a full-featured network up and running:

Parameters p = NetworkDemoHarness.getParameters();
p = p.union(NetworkDemoHarness.getNetworkDemoTestEncoderParams());

Network network = Network.create("Network API Demo", p)
    .add(Network.createRegion("Region 1")
        .add(Network.createLayer("Layer 2/3", p)
            .alterParameter(KEY.AUTO_CLASSIFY, Boolean.TRUE)
            .add(Anomaly.create())
            .add(new TemporalMemory())
            .add(new SpatialPooler())
            .add(Sensor.create(FileSensor::create, SensorParams.create(
                Keys::path, "", ResourceLocator.path("rec-center-hourly.csv"))))));

network.start();

…and that’s it!

For less “ear-chattering” and more “quick-starting”, see the Quick Start Guide & Docs


Decided to stick around eh?

Well, since you’re here, let’s talk more about how the NAPI is designed and a bit more about its features.

In HTM.java, the hierarchical entities seek to provide analogues of actual biological structures. A Network can be thought of as a single neocortex, while a Region can be thought of as a location in the neocortex that is sensitized to a particular function, behavior or memory. Lastly, Layers are metaphors for actual layers in the neocortex, such as the 6 layers identified by neurological researchers. Thus, in HTM.java, the Layer is the container for the Connections object and eventually will hold the column/cell matrix that the Connections object now holds. (Hint!: Work to be done here.) The SpatialPooler and TemporalMemory are not physical members of the hierarchy; they describe how connections are formed and associated, and how data flows through the algorithms.

At the top level there is the Network. A Network is a container of Regions, which in turn contain Layers, which in turn contain Columns, Cells, Dendrites and Synapses, all of which are acted upon by the different algorithms (e.g. SP, TM). A Network can both be listened to and have data submitted to it. In fact, this is its normal usage pattern.

Regions are collections of Layers. While they are not an optional construct within a Network, they also don’t do all that much. Regions are best thought of as “groups” of algorithmic hierarchies that may, in the future, specialize in types of knowledge, types of behavior or other functionality. They are here partly to provide an analogue of Regions in the neocortex, but mostly to identify and distinguish layers of functionality specialized for a particular purpose.

The last level in the hierarchy is the Layer. Layers contain the column/cell matrix (Connections) and house all the algorithmic and functional components. This is where the heart of the NAPI beats. Layers house most of the functionality, automatically forming computational units implemented via RxJava Observables which know how to connect one algorithm to another, transforming the data along the way so that the next algorithm (whatever it may be) receives its input in the format it expects. Layers emit Inference objects when subscribed to (either via their Layer.subscribe() method or by retrieving their Observable<Inference> object and calling Observable.subscribe() on it), and may have data directly submitted to them via the onNext() method of their Observable. To retrieve their Observable, simply call Layer.observe(), which returns an Observable<Inference> object. While the Layer object is flexible enough to function this way, the typical usage pattern will be to create a Network object (with a Sensor connected at the bottom), start it, then subscribe to it via its Observable (Network.observe()) in order to receive its emitted Inference objects. All hierarchy objects may be “listened” to by calling Network.observe(), Region.observe(), or Layer.observe() / Layer.subscribe().

The Inference object emitted contains a vast amount of information, such as the predicted columns, the previously predicted columns and the activated columns. In addition, data may be retrieved from a given Layer following each cycle.
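The "submit data in, listen for results out" pattern described above can be sketched without any library at all. The following is only an illustration under stated assumptions: TinyLayer, its subscribe() and compute() methods, and the String result are hypothetical stand-ins, not HTM.java or RxJava classes, and the real Layer emits Inference objects through an Observable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a Layer; NOT part of HTM.java.
final class TinyLayer {
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    // Register a listener that receives every emitted result.
    public void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    // Process one input and push the result to every subscriber.
    public void compute(int[] input) {
        int activeBits = 0;
        for (int bit : input) {
            if (bit != 0) activeBits++;
        }
        String result = "active=" + activeBits;
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(result);
        }
    }

    public static void main(String[] args) {
        TinyLayer layer = new TinyLayer();
        layer.subscribe(r -> System.out.println("received " + r));
        layer.compute(new int[] {1, 0, 1, 1}); // prints "received active=3"
    }
}
```

The point of the sketch is only the shape of the interaction: listeners are attached first, and each call that submits data results in one emission to every listener.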


Usage (A bit of FRP)

First, as you can see above, entire networks can be constructed using the “fluent” style of declaration (all one method call chain). This reduces boilerplate typing and clarifies the hierarchical structure of the API’s constructs.

Secondly, the whole framework is designed with streaming data and asynchronous usage in mind. The network’s Sensors were implemented using the Java 8 Streams API and will be fully asynchronous-ready once the work to make the various Encoders thread safe is completed.

HTM.java’s Stream usage has been customized to include “batching” of input, speeding up parallel processing by as much as 30 to 40%. The BatchedCsvStreamBenchmark test class can be run as a demonstration. (Your mileage may vary.)
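To give a feel for what “batching” means with Java 8 Streams, here is a minimal sketch that groups input lines into fixed-size batches and hands whole batches to a parallel stream. It is not HTM.java’s implementation: the class BatchingSketch, its methods, and the batch size are assumptions made for illustration, and no speed-up is claimed for this toy workload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical illustration of batched parallel processing; NOT HTM.java code.
final class BatchingSketch {
    // Split the input into consecutive batches of at most batchSize items.
    public static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        int batchCount = (items.size() + batchSize - 1) / batchSize;
        return IntStream.range(0, batchCount)
            .mapToObj(i -> items.subList(i * batchSize, Math.min(items.size(), (i + 1) * batchSize)))
            .collect(Collectors.toList());
    }

    // Parse one batch sequentially; this is the unit of work given to a worker thread.
    public static List<Integer> parseBatch(List<String> batch) {
        List<Integer> parsed = new ArrayList<>(batch.size());
        for (String line : batch) parsed.add(Integer.parseInt(line.trim()));
        return parsed;
    }

    // Process batches in parallel; the ordered source keeps results in input order.
    public static List<Integer> parseAll(List<String> lines, int batchSize) {
        return batches(lines, batchSize).parallelStream()
            .map(BatchingSketch::parseBatch)
            .flatMap(List::stream)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 10; i++) lines.add(Integer.toString(i));
        System.out.println(batches(lines, 4).size()); // prints 3
        System.out.println(parseAll(lines, 4));       // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

Handing each worker a batch rather than a single line reduces per-item coordination overhead, which is the general idea behind batching input for parallel streams.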

Finally, the RxJava Reactive Extensions library was used throughout to simplify the connection of algorithms within a Layer.

Think about it…

Each Layer can contain one or more components, and it is unknown which ones will exist in any given Layer. Next, these components all output different data and take in different data in different combinations. So a framework that allows for all of this must be adaptable and able to transform data based on the dynamic existence or non-existence of any particular algorithm or component!
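One way to picture a container that adapts to whichever components happen to be present is to compose an ordered list of optional stages into a single function. The sketch below is a simplification under stated assumptions: StageChainSketch and its chain() method are hypothetical, and a plain String stands in for the richer objects that real components exchange.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical illustration of chaining optional stages; NOT HTM.java code.
final class StageChainSketch {
    // Compose whichever stages are present, in order, into one function.
    public static UnaryOperator<String> chain(List<UnaryOperator<String>> stages) {
        return input -> {
            String value = input;
            for (UnaryOperator<String> stage : stages) {
                value = stage.apply(value);
            }
            return value;
        };
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> stages = new ArrayList<>();
        stages.add(s -> s + " -> encoded");
        boolean hasPooler = true; // toggled to mimic an optional component
        if (hasPooler) stages.add(s -> s + " -> pooled");
        System.out.println(chain(stages).apply("raw")); // prints "raw -> encoded -> pooled"
    }
}
```

With an empty stage list the chain simply returns its input, so the same calling code works whether a given component exists or not.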

Additionally, because at all levels (except the bottom input level) Inference objects are passed between components, layers and regions, RxJava Observables can be obtained at any point in the hierarchy and subscribed to in order to receive output data –or– they can be combined, operated upon and manipulated in Map-Reduce form! How exciting is that?

This means that you can take output from any point and operate on it (combine, zip, merge, delay etc.) in your own applications using RxJava idioms!


A bit about input / output (FEED ME!)

This is discussed in more detail in the wiki linked below, but we just wanted to touch on this briefly.

Networks can be fed by connecting them to receive streaming data “automatically”, which is done by declaring a FileSensor or URLSensor. Additionally, you can call Layer.compute() directly with an int[], String[] (not fully implemented), Map (like an encoder mapping), or ManualInput object. Lastly, you can use an ObservableSensor to either feed in data manually/programmatically –or– connect to other Observable emitters within your applications!

ManualInput objects inherit from Inference (the interface received on output and passed between layers, regions, and other components).

At the other end, Networks can be subscribed to in order to receive data “out of the top” (And so can Regions, and Layers - see Region.observe() and Layer.observe() ).

Here’s a brief look at the Observable interface’s simplicity:

Observable<Inference> o = network.observe(); // This Observable can be used a million different ways!
o.subscribe(new Subscriber<Inference>() {
    @Override
    public void onCompleted() {
        // Do finalization work here if desired
    }

    @Override
    public void onError(Throwable e) {
        // Error handling here
    }

    @Override
    public void onNext(Inference i) {
        // Called on every cycle of the Network - work is done here
    }
});

Where do I fit in?

Now that the groundwork has been laid, there are a TON of things to do! If you would like to get your name in bright lights, or simply help out with HTM.java, drop a line to the hackers mailing list and offer to help. All are welcome (at any level!).

For more in-depth information (work in progress), start with the NAPI - In-Depth Component Overview, which leads into a tutorial series of wiki pages and docs.

To dig right in, look at the Quick Start Guide.


Lastly…

I’d like to mention that I’ve had a lot of help from my team members and the leadership at Cortical.io with organizing my work, and that I have also been generously granted the time to work on this project! (I have such a great job!) Thank you!

Happy Hacking!

David Ray
Java Solutions Architect
Cortical.io


10 Feb 2015

HTM.java Receives Benchmark Harness

They say in order to lead, one must have someplace to go. It’s also true that in order to arrive, one must have departed from someplace (duh! :-P). In software optimization, knowing where one came from (and establishing baselines) is a very large part of the battle, and the same things that make Java™ such an attractive and ubiquitous platform also make it one of the hardest environments to benchmark.

Unlike C, which is a statically compiled language, Java is very dynamic and undergoes very aggressive optimization and runtime profiling while it compiles down to native code on the fly! As such, there are many pitfalls one can come across when benchmarking Java code, which is why JMH was chosen as HTM.java’s benchmark tool. JMH was developed as part of the OpenJDK project (the incubator Oracle draws from to create their “official” JDK), and while it doesn’t make benchmarks infallible, it will help surmount some of the hardest issues benchmarking Java can present (such as accounting for optimizations: see Loop Unrolling, Dead Code Elimination, and Escape Analysis).
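To make two of these pitfalls concrete, the hand-rolled sketch below separates a “warm up” phase from a timed phase and folds every result into a checksum so the work cannot be discarded as dead code. It is deliberately naive and is not JMH: the class NaiveTimingSketch, its workload, and the iteration counts are assumptions chosen for illustration, and a harness such as JMH guards against many more effects than this does.

```java
// Hypothetical, deliberately naive timing loop; NOT a replacement for JMH.
final class NaiveTimingSketch {
    // The workload under test: sum of squares of the integers below n.
    public static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += (long) i * i;
        return sum;
    }

    // Run the workload repeatedly and fold each result into a checksum,
    // so the calls have an observable effect and cannot be treated as dead code.
    public static long runIterations(int iterations, int n) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) checksum += work(n);
        return checksum;
    }

    public static void main(String[] args) {
        // Warm-up: gives the JIT a chance to compile the hot path; timing is ignored.
        long warmUpChecksum = runIterations(1_000, 10_000);

        // Timed run: same workload, measured after warm-up.
        long start = System.nanoTime();
        long timedChecksum = runIterations(1_000, 10_000);
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("checksums=" + warmUpChecksum + "," + timedChecksum
            + " elapsedNanos=" + elapsedNanos);
    }
}
```

Even with these precautions the measured time can still be distorted by loop optimizations, garbage collection and other runtime behavior, which is the motivation for using a dedicated harness.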

For more detailed information about benchmarking Java, please watch the talk by Aleksey Shipilev; it’s a crucial and informative watch.

Uhm… Back to HTM.java, right?

Right! :)

HTM.java’s new benchmarking package is not the end of the road - it is just the beginning. It is a place to “depart” (using the previously established vernacular). The package can be found in the “src/jmh/java” directory, and is comprised of 4 classes to start off with:

Gradle Build Integration

The build.gradle file is configured to run the above benchmarks (they only take one minute) during Travis CI continuous integration builds.

Fun Fact: The SpatialPooler benchmark is run 1000 times for the “warm up” and 1000 times for the “timing run”; likewise, the TemporalPooler is run 45,000 times for each. A lot can happen in one minute!

These can also be run on the command line using:

Human readable results can be found in: “<git source dir>/build/reports/jmh/human.txt”.

The jar file created by the “gradle check” command can be found at: “<source dir>/build/libs/htm.java-0.40-jmh.jar”. The following command can be executed (from inside the “<source dir>/build/libs” directory) to simply run the benchmarks after “gradle check” has been run at least once.

java -jar htm.java-0.40-jmh.jar

The “0.40” part of the file name may change when new versions of HTM.java are released.

You can contribute too!

In addition, any classes found in the “src/jmh/java” directory will be automagically run during “gradle check”.

The “src/jmh/java” directory was chosen by the author of the jmh gradle plugin as a means of allowing the integration of benchmarks into any project without having to create a separate project directory structure. (Big Kudos to Cedric Champeau) Thanks!

In addition, there is a text file located here entitled “jmh_defaults.txt” which lists all the jmh command line flags the gradle plugin supports - for your convenience!

Happy Benchmarking!

David Ray
Lead Programmer, HTM.java


03 Dec 2014

HTM in Java!

Introducing the feature-complete htm.java project!

Thanks to a lot of effort by David Ray, the NuPIC hacker community, and Numenta’s own development staff, a fully usable version of NuPIC (minus swarming and OPF functionality ➟ coming soon!) is now available in Java.

This port is 100% functionally equivalent to NuPIC’s Network API.

See the complete javadocs here!

Bringing NuPIC to Java is an important milestone in NuPIC technology due to the size and significance of Java’s user base. Because Java is one of the world’s most used programming languages, HTM now has the advantage of being exposed to an extraordinary number of new developers and users.

htm.java is easy to set up and configure, and because the JVM is such a common runtime, it is instantly available on platforms like Windows and all Linux flavors without the use of a virtual machine (not to mention mobile devices!).

The community is fully committed to adding all the support tools and infrastructure the Python version enjoys, and those are the next milestones to be worked on. There is already a “HelloSP” example created by a community member. Here is the constructor:

/**
 * 
 * @param inputDimensions         The size of the input.  {m, n} will give a size of m x n
 * @param columnDimensions        The size of the 2 dimensional array of columns
 */
HelloSP(int[] inputDimensions, int[] columnDimensions) {
    inputSize = 1;
    columnNumber = 1;
    for (int x : inputDimensions) {
        inputSize *= x;
    }
    for (int x : columnDimensions) {
        columnNumber *= x;
    }
    activeArray = new int[columnNumber];
    
    parameters = Parameters.getSpatialDefaultParameters();
    parameters.setParameterByKey(KEY.INPUT_DIMENSIONS, inputDimensions);
    parameters.setParameterByKey(KEY.COLUMN_DIMENSIONS, columnDimensions);
    parameters.setParameterByKey(KEY.POTENTIAL_RADIUS, inputSize);
    parameters.setParameterByKey(KEY.GLOBAL_INHIBITIONS, true);
    parameters.setParameterByKey(KEY.NUM_ACTIVE_COLUMNS_PER_INH_AREA, 0.02*columnNumber);
    parameters.setParameterByKey(KEY.SYN_PERM_ACTIVE_INC, 0.01);
    parameters.setParameterByKey(KEY.SYN_PERM_TRIM_THRESHOLD, 0.005);

    sp = new SpatialPooler();
    mem = new Connections();
    parameters.apply(mem);
    sp.init(mem);
}
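The arithmetic in this constructor is easy to check by hand for the dimensions used in the example that follows ({32, 32} inputs and {64, 64} columns). The sketch below is a standalone illustration: DimensionMathSketch and product() are hypothetical names, not part of HelloSP or HTM.java.

```java
// Hypothetical standalone check of the constructor's arithmetic; NOT HTM.java code.
final class DimensionMathSketch {
    // Multiply the extents of every dimension to get the flattened size.
    public static int product(int[] dimensions) {
        int size = 1;
        for (int extent : dimensions) size *= extent;
        return size;
    }

    public static void main(String[] args) {
        int inputSize = product(new int[] {32, 32});    // 32 * 32 = 1024 input bits
        int columnNumber = product(new int[] {64, 64}); // 64 * 64 = 4096 columns
        double activeTarget = 0.02 * columnNumber;      // 2% of 4096 = 81.92
        System.out.println(inputSize + " " + columnNumber + " " + activeTarget);
    }
}
```

The sample output later in this post lists 81 column indices per SDR, which is consistent with that 81.92 target being truncated to a whole number, although the exact rounding rule is an assumption here.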

And here is the usage:

public static void main(String args[]) {
    HelloSP example = new HelloSP(new int[]{32, 32}, new int[]{64, 64});
    
    // Lesson 1
    System.out.println("\n \nFollowing columns represent the SDR");
    System.out.println("Different set of columns each time since we randomize the input");
    System.out.println("Lesson - different input vectors give different SDRs\n\n");
    
    //Trying random vectors
    for (int i = 0; i < 3; i++) {
        example.createInput();
        example.run();
    }
    
    //Lesson 2
    System.out.println("\n\nIdentical SDRs because we give identical inputs");
    System.out.println("Lesson - identical inputs give identical SDRs\n\n");

    for (int i = 0; i < 75; i++) System.out.print("-");
    System.out.print("Using identical input vectors");
    for (int i = 0; i < 75; i++) System.out.print("-");
    System.out.println();

    //Trying identical vectors
    for (int i = 0; i < 2; i++) {
        example.run();
    }
    
    // Lesson 3
    System.out.println("\n\nNow we are changing the input vector slightly.");
    System.out.println("We change a small percentage of 1s to 0s and 0s to 1s.");
    System.out.println("The resulting SDRs are similar, but not identical to the original SDR");
    System.out.println("Lesson - Similar input vectors give similar SDRs\n\n");

    // Adding 10% noise to the input vector
    // Notice how the output SDR hardly changes at all
    for (int i = 0; i < 75; i++) System.out.print("-");
    System.out.print("After adding 10% noise to the input vector");
    for (int i = 0; i < 75; i++) System.out.print("-");
    example.addNoise(0.1);
    example.run();

    // Adding another 20% noise to the already modified input vector
    // The output SDR should differ considerably from that of the previous output
    for (int i = 0; i < 75; i++) System.out.print("-");
    System.out.print("After adding another 20% noise to the input vector");
    for (int i = 0; i < 75; i++) System.out.print("-");
    example.addNoise(0.2);
    example.run();
}

Running this example code prints out the resulting SDRs to the console like this:

Now we are changing the input vector slightly.
We change a small percentage of 1s to 0s and 0s to 1s.
The resulting SDRs are similar, but not identical to the original SDR
Lesson - Similar input vectors give similar SDRs


---------------------------------------------------------------------------After adding 10% noise to the input vector-----------------------------------------------------------------------------------------------------------------------------------------------------------Computing the SDR----------------------------------------------------------------------
[63, 197, 286, 360, 400, 517, 518, 559, 561, 587, 590, 611, 619, 645, 704, 811, 1022, 1065, 1184, 1407, 1461, 1554, 1574, 1652, 1686, 1704, 1765, 1772, 1849, 1871, 1945, 2090, 2125, 2159, 2203, 2213, 2233, 2288, 2358, 2367, 2415, 2434, 2462, 2599, 2609, 2617, 2755, 2862, 2889, 2938, 2967, 2976, 2995, 3010, 3018, 3057, 3104, 3126, 3226, 3341, 3370, 3373, 3394, 3398, 3399, 3479, 3484, 3540, 3637, 3662, 3669, 3712, 3754, 3817, 3875, 3915, 3941, 3977, 3989, 4034, 4082]
---------------------------------------------------------------------------After adding another 20% noise to the input vector-----------------------------------------------------------------------------------------------------------------------------------------------------------Computing the SDR----------------------------------------------------------------------
[63, 197, 286, 310, 360, 400, 418, 517, 518, 559, 561, 587, 611, 619, 704, 811, 1022, 1065, 1184, 1248, 1461, 1485, 1552, 1554, 1574, 1611, 1652, 1669, 1686, 1704, 1772, 1849, 2090, 2125, 2159, 2203, 2213, 2233, 2367, 2415, 2434, 2462, 2545, 2599, 2609, 2617, 2755, 2846, 2862, 2889, 2938, 2967, 2976, 2995, 3008, 3010, 3018, 3057, 3104, 3106, 3126, 3226, 3264, 3341, 3370, 3394, 3399, 3479, 3484, 3540, 3637, 3664, 3669, 3712, 3875, 3915, 3959, 3977, 3989, 4034, 4082]
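A simple way to quantify how similar two such SDRs are is to count the column indices they share. The sketch below assumes both arrays are sorted in ascending order without duplicates, as the printed SDRs above appear to be. SdrOverlapSketch and overlap() are hypothetical names, and the miniature arrays are made up for illustration rather than taken from the output.

```java
// Hypothetical helper for comparing two SDRs; NOT part of HelloSP or HTM.java.
final class SdrOverlapSketch {
    // Count indices present in both arrays.
    // Assumes each array is sorted ascending and contains no duplicates.
    public static int overlap(int[] a, int[] b) {
        int i = 0, j = 0, shared = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                shared++;
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++;
            } else {
                j++;
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        // Made-up miniature SDRs that differ in a single active column.
        int[] before = {3, 8, 15, 21, 40};
        int[] after = {3, 8, 16, 21, 40};
        System.out.println(overlap(before, after)); // prints 4
    }
}
```

A high overlap relative to the number of active columns indicates that the two inputs were treated as similar.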

This is an outstanding milestone for Numenta, NuPIC and the NuPIC community because of all the advantages the Java language brings with it. It shows that the NuPIC community contains a full and vibrant user base that is very committed to the success of NuPIC and HTM technologies. The development of the Java version by the NuPIC community also validates the choice of making NuPIC open source, showing that its community desires a fully compliant version of NuPIC that is easy to manage, easy to install and widely applicable.

Want to get involved?

Are you a Java programmer interested in neocortically-inspired machine intelligence? Check out the htm.java road map and find out where we need help. Create some sample applications and get your feet wet with HTM on the JVM.


On a personal note, I’d like to give a big thank you to David Ray. He came to us earlier this year with a plan for this Java port and the full intention of giving the codebase over to Numenta for management. Over the past several months, he has worked tirelessly reading the NuPIC codebase and painstakingly creating Java versions of all our algorithms. Congratulations to David for reaching this milestone and creating a complete port of NuPIC.

Matt Taylor
Open Source Community Flag-Bearer
Numenta, Inc.
