NuPIC  0.2.7.dev0
Numenta Platform for Intelligent Computing
ScalarEncoder Class Reference

A scalar encoder encodes a numeric (floating point) value into an array of bits.

[Inheritance diagram for ScalarEncoder; classes shown: Encoder, AdaptiveScalarEncoder, DeltaEncoder]

Public Member Functions

def __init__
 w – number of bits to set in output; minval – minimum input value; maxval – maximum input value (input is strictly less if periodic == True)
def getDecoderOutputFieldTypes
 [Encoder class virtual method override]
def getBucketIndices
 See method description in
def encodeIntoArray
 See method description in
def decode
 See the function description in
def getBucketValues
 See the function description in
def getBucketInfo
 See the function description in
def topDownCompute
 See the function description in
def closenessScores
 See the function description in
- Public Member Functions inherited from Encoder
def getWidth
 Should return the output width, in bits.
def encodeIntoArray
 Encodes inputData and puts the encoded value into the numpy output array, which is a 1-D array of length returned by getWidth().
def setLearning
 Set whether learning is enabled.
def setFieldStats
 This method is called by the model to set the statistics like min and max for the underlying encoders if this information is available.
def encode
 Convenience wrapper for encodeIntoArray.
def getScalarNames
 Return the field names for each of the scalar values returned by getScalars.
def getDecoderOutputFieldTypes
 Returns a sequence of field types corresponding to the elements in the decoded output field array.
def setStateLock
 Setting this to true freezes the state of the encoder. This is separate from the learning state, which affects changing parameters.
def getEncoderList
def getScalars
 Returns a numpy array containing the sub-field scalar value(s) for each sub-field of the inputData.
def getEncodedValues
 Returns the input in the same format as is returned by topDownCompute().
def getBucketIndices
 Returns an array containing the sub-field bucket indices for each sub-field of the inputData.
def scalarsToStr
 Return a pretty print string representing the return values from getScalars and getScalarNames().
def getDescription
 This returns a list of tuples, each containing (name, offset).
def getFieldDescription
 Return the offset and length of a given field within the encoded output.
def encodedBitDescription
 Return a description of the given bit in the encoded output.
def pprintHeader
 Pretty-print a header that labels the sub-fields of the encoded output.
def pprint
 Pretty-print the encoded output using ascii art.
def decode
 Takes an encoded output and does its best to work backwards and generate the input that would have generated it.
def decodedToStr
 Return a pretty print string representing the return value from decode().
def getBucketValues
 Returns a list of items, one for each bucket defined by this encoder.
def getBucketInfo
 Returns a list of EncoderResult namedtuples describing the inputs for each sub-field that correspond to the bucket indices passed in 'buckets'.
def topDownCompute
 Returns a list of EncoderResult namedtuples describing the top-down best guess inputs for each sub-field given the encoded output.
def closenessScores
 Compute closeness scores between the expected scalar value(s) and actual scalar value(s).
def getDisplayWidth
 Calculate width of display for bits plus blanks between fields.
def formatBits
 Copy one array to another, inserting blanks between fields (for display). If leftpad is one, then there is a dummy value at element 0 of the arrays, and we should start our counting from 1 rather than 0.

Detailed Description

A scalar encoder encodes a numeric (floating point) value into an array of bits.

The output is 0's except for a contiguous block of 1's. The location of this contiguous block varies continuously with the input value.

The encoding is linear. If you want a nonlinear encoding, just transform the scalar (e.g. by applying a logarithm function) before encoding. It is not recommended to bin the data as a pre-processing step, e.g. "1" = $0 - $.20, "2" = $.21-$0.80, "3" = $.81-$1.20, etc. as this removes a lot of information and prevents nearby values from overlapping in the output. Instead, use a continuous transformation that scales the data (a piecewise transformation is fine).
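As a rough illustration of the advice above, the sketch below applies a continuous logarithmic transform to a scalar before it would be handed to the encoder. The helper `log_scale`, its `floor` argument, and the sample prices are hypothetical and not part of NuPIC; the point is only that nearby inputs stay nearby after the transform, whereas binning would separate or merge them arbitrarily.

```python
import math

def log_scale(value, floor=0.01):
    """Continuous, monotonic transform to apply before encoding.

    `floor` is a hypothetical guard that keeps the argument of log10 positive.
    """
    return math.log10(max(value, floor))

prices = [0.05, 0.20, 0.21, 0.80, 1.20]
scaled = [log_scale(p) for p in prices]

# Ordering is preserved, so a linear encoder applied to `scaled`
# can still give overlapping outputs for nearby prices.
assert scaled == sorted(scaled)

# 0.20 and 0.21 remain distinct but close, rather than landing in
# different bins ("1" and "2") as in the binning scheme described above.
assert 0 < scaled[2] - scaled[1] < 0.05
```

The transformed values, not the raw ones, would then determine minval and maxval for the encoder.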


w – The number of bits that are set to encode a single value (the "width" of the output signal). Restriction: w must be odd to avoid centering problems.

minval – The minimum value of the input signal.

maxval – The upper bound of the input signal.

periodic – If true, the input value "wraps around" such that minval = maxval. For a periodic value, the input must be strictly less than maxval; otherwise maxval is a true upper bound.

There are three mutually exclusive parameters that determine the overall size of the output. Only one of these should be specified to the constructor:

n – The number of bits in the output. Must be greater than or equal to w.

radius – Two inputs separated by more than the radius have non-overlapping representations. Two inputs separated by less than the radius will in general overlap in at least some of their bits. You can think of this as the radius of the input.

resolution – Two inputs separated by greater than, or equal to, the resolution are guaranteed to have different representations.

Note: radius and resolution are specified w.r.t. the input, not the output. w is specified w.r.t. the output.

Example: day of week.

  w = 3
  minval = 1 (Monday)
  maxval = 8 (Monday)
  periodic = true
  n = 14 [equivalently: radius = 1.5 or resolution = 0.5]

The following values would encode midnight – the start of the day:

  monday (1)    -> 11000000000001
  tuesday (2)   -> 01110000000000
  wednesday (3) -> 00011100000000
  ...
  sunday (7)    -> 00000000000111

Since the resolution is 12 hours, we can also encode noon, as:

  monday noon                  -> 11100000000000
  monday midnight (end of day) -> 01110000000000
  tuesday noon                 -> 00111000000000
  etc.
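The day-of-week example can be reproduced with a few lines of plain Python. This is a minimal sketch under the stated parameters, not NuPIC's implementation; the function name `encode_periodic` is hypothetical, and it returns a bit string rather than a numpy array.

```python
def encode_periodic(value, w, minval, maxval, n):
    """Return the n-bit string for `value` under a periodic scalar encoding."""
    if not (minval <= value < maxval):
        raise ValueError("periodic input must satisfy minval <= value < maxval")
    half = (w - 1) // 2
    # The input range maps linearly onto the n bit positions; this index
    # is the centre of the contiguous block of 1's.
    center = int((value - minval) * n / (maxval - minval))
    bits = ["0"] * n
    for offset in range(-half, half + 1):
        bits[(center + offset) % n] = "1"  # wrap around at either end
    return "".join(bits)

print(encode_periodic(1, w=3, minval=1, maxval=8, n=14))    # 11000000000001
print(encode_periodic(1.5, w=3, minval=1, maxval=8, n=14))  # 11100000000000
print(encode_periodic(2, w=3, minval=1, maxval=8, n=14))    # 01110000000000
```

Each half-day step moves the block by one bit, which is what a resolution of 0.5 with n = 14 over a range of 7 implies.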

It may not be natural to specify "n", especially with non-periodic data. For example, consider encoding an input with a range of 1-10 (inclusive) using an output width of 5 (w = 5). If you specify resolution = 1, this means that inputs of 1 and 2 have different outputs, though they overlap, but 1 and 1.5 might not have different outputs. This leads to a 14-bit representation like this:

  1  -> 11111000000000 (14 bits total)
  2  -> 01111100000000
  ...
  10 -> 00000000011111
  [resolution = 1; n = 14; radius = 5]

You could specify resolution = 0.5, which gives:

  1   -> 11111000... (23 bits total)
  1.5 -> 011111.....
  2.0 -> 0011111....
  [resolution = 0.5; n = 23; radius = 2.5]

You could specify radius = 1, which gives:

  1  -> 111110000000.... (50 bits total)
  2  -> 000001111100....
  3  -> 000000000011111...
  ...
  10 -> .....000011111
  [radius = 1; resolution = 0.2; n = 50]

An N/M encoding can also be used to encode a binary value, where we want more than one bit to represent each state. For example, we could have w = 5, minval = 0, maxval = 1, radius = 1 (which is equivalent to n = 10):

  0 -> 1111100000
  1 -> 0000011111
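The non-periodic examples above can be checked with a small stand-alone sketch. The function `encode_bounded` is hypothetical, and the way it sizes the output (one bucket per resolution step, plus w - 1 trailing bits so the block for maxval fits) is an assumption read off the worked examples rather than taken from NuPIC's source.

```python
def encode_bounded(value, w, minval, maxval, resolution):
    """Return the bit string for `value` under a non-periodic scalar encoding."""
    if not (minval <= value <= maxval):
        raise ValueError("input outside [minval, maxval]")
    # ASSUMPTION: one bucket per resolution step across [minval, maxval],
    # plus w - 1 extra bits so the block of 1's for maxval still fits.
    buckets = int(round((maxval - minval) / resolution)) + 1
    n = buckets + (w - 1)
    # Nearest bucket; inputs closer together than the resolution may share it.
    first = int((value - minval) / resolution + 0.5)
    return "0" * first + "1" * w + "0" * (n - first - w)

# Range 1-10, w = 5, resolution = 1
print(encode_bounded(1, 5, 1, 10, resolution=1))   # 11111000000000
print(encode_bounded(2, 5, 1, 10, resolution=1))   # 01111100000000
print(encode_bounded(10, 5, 1, 10, resolution=1))  # 00000000011111

# Binary value: w = 5, radius = 1, so resolution = radius / w = 0.2
print(encode_bounded(0, 5, 0, 1, resolution=0.2))  # 1111100000
print(encode_bounded(1, 5, 0, 1, resolution=0.2))  # 0000011111
```

With radius = 1 the blocks for consecutive integers sit side by side without overlapping, which matches the definition of radius given earlier.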

Implementation details:

  range = maxval - minval
  h = (w-1)/2 (half-width)
  resolution = radius / w
  n = w * range/radius (periodic)
  n = w * range/radius + 2 * h (non-periodic)
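The relations above can be turned into a short calculation of n from radius or resolution. This is a sketch, not NuPIC's code: `output_width` is a hypothetical helper, and for the non-periodic case it assumes the range is widened by one resolution step so that maxval gets its own bucket. That assumption is what reproduces the n = 14, n = 50 and n = 10 figures in the worked examples; taking range as maxval - minval alone would give 13 for the 1-10, resolution = 1 case.

```python
import math

def output_width(w, minval, maxval, periodic, radius=0, resolution=0):
    """Derive n from radius or resolution (0 means "not set")."""
    if (radius != 0) == (resolution != 0):
        raise ValueError("set exactly one of radius, resolution")
    if radius != 0:
        resolution = radius / w
    else:
        radius = resolution * w
    h = (w - 1) // 2           # half-width
    rng = maxval - minval
    if periodic:
        nfloat = w * rng / radius
    else:
        # ASSUMPTION: widen the range by one resolution step (see lead-in).
        nfloat = w * (rng + resolution) / radius + 2 * h
    # Round first so floating-point noise cannot push ceil up by one.
    return int(math.ceil(round(nfloat, 9)))

print(output_width(3, 1, 8, periodic=True, radius=1.5))      # 14
print(output_width(5, 1, 10, periodic=False, resolution=1))  # 14
print(output_width(5, 1, 10, periodic=False, radius=1))      # 50
print(output_width(5, 0, 1, periodic=False, radius=1))       # 10
```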

Constructor & Destructor Documentation

def __init__ (   self,
  w,
  minval,
  maxval,
  periodic = False,
  n = 0,
  radius = DEFAULT_RADIUS,
  resolution = DEFAULT_RESOLUTION,
  name = None,
  verbosity = 0,
  clipInput = False,
  forced = False
 )

w – number of bits to set in output

minval – minimum input value

maxval – maximum input value (input is strictly less if periodic == True)

Exactly one of n, radius, resolution must be set. "0" is a special value that means "not set".

n – number of bits in the representation (must be > w)

radius – inputs separated by more than, or equal to, this distance will have non-overlapping representations

resolution – inputs separated by more than, or equal to, this distance will have different representations
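The "exactly one of n, radius, resolution" rule can be expressed as a small check. The helper below is hypothetical and only illustrates the convention that 0 means "not set"; it is not the validation NuPIC performs.

```python
def pick_size_parameter(n=0, radius=0, resolution=0):
    """Return (name, value) for the single size parameter that is set.

    Hypothetical helper: a value of 0 is treated as "not set".
    """
    candidates = (("n", n), ("radius", radius), ("resolution", resolution))
    given = [(name, value) for name, value in candidates if value != 0]
    if len(given) != 1:
        raise ValueError("exactly one of n, radius, resolution must be set")
    return given[0]

print(pick_size_parameter(radius=1.5))  # ('radius', 1.5)
```

Passing none of the three, or more than one, raises ValueError in this sketch.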

name – an optional string which will become part of the description

clipInput – if true, non-periodic inputs smaller than minval or greater than maxval will be clipped to minval/maxval
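A minimal sketch of the clipping behaviour described for clipInput follows. The function `apply_clip` is hypothetical, and rejecting out-of-range input when clipping is disabled is an assumption; the text above only specifies what happens when clipInput is true.

```python
def apply_clip(value, minval, maxval, clip_input=False):
    """Clip a non-periodic input to [minval, maxval] when clip_input is true."""
    if minval <= value <= maxval:
        return value
    if clip_input:
        return min(max(value, minval), maxval)
    # ASSUMPTION: without clipping, out-of-range input is treated as an error.
    raise ValueError("input %r is outside [%r, %r]" % (value, minval, maxval))

print(apply_clip(12, 1, 10, clip_input=True))  # 10
print(apply_clip(-3, 1, 10, clip_input=True))  # 1
```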

forced – if true, skip some safety checks (for compatibility reasons), default false

See class documentation for more information.

The documentation for this class was generated from the following file: