M8M

A minimalistic, hopefully educational cryptocurrency miner.

These pages serve as the online help for users looking to configure or run the program. They were last updated at the release of version 0.1.896 Beta.

These pages also contain information about the various concepts involved in the development of M8M. Hopefully those topics will be elaborated to the point of becoming a decent source of explanations regarding the cryptocurrency mining process.
For more information about the design rationale and the history of the program, refer to the project readme.

System requirements

Usage

Builds and releases

M8M is open source software. At any moment, you can just pull the source code and build your own executable.
Sometimes, new builds will be made available. This can happen to fix an important bug or to test a new set of features. Builds are identified by a single number, <patch_num>.
Versions are more significant builds. They get tagged with an additional major and minor version and are usually provided in an installer named M8M-<major_ver>.<minor_ver>.<patch_num>.
In both cases, <patch_num> always uniquely identifies the contents: higher numbers imply a more recent version, hopefully always better!
Consider builds (a single number) work in progress (you may or may not want to upgrade), while versions (three numbers) are suggested upgrades.

The first public release was build 249, which appeared on gamedev.net on 31 Jul 2014. Its binary is not available on GitHub but rather on Google Drive.
All others are on GitHub.

FAQ

Can you elaborate on linearIntensity?

It controls the number of hashes computed at each step of the computation. This amount is referred to as hashCount. Unlike legacy miners, this is not the global work size passed to the GPU, nor does it map directly to the amount of concurrent work.
The goal is to control how much time the GPU spends hashing at each step, which relates to system responsiveness.
Legacy (OpenCL) miners use three methods of controlling global work size:
  1. intensity → global work size = 2^intensity; adjustments might apply depending on the algorithm, and you might effectively be setting offsets instead. This is the oldest and most coarse-grained method.
  2. xintensity ("experimental intensity" or shader-count-based intensity) is something that scales with GPU power.
    1. In concept, it is global work size = SHADER_COUNT * xintensity, where SHADER_COUNT estimates "how big" your GPU is.
    2. At a certain point somebody decided this could instead multiply a work size determined exponentially as above.
    Nonetheless, it is always something multiplied by "GPU size". The behaviour is selected per algorithm.
    It has a few quirks: the word "shader" is not OpenCL parlance and the term has been used in different ways over the years. Nowadays it seems to be equivalent to "core count" in marketing parlance... whether legacy miners compute this as intended is debatable.
    In practice xintensity is quite a sensible setting, akin to the slider in the web wizard which eventually determines linearIntensity.
  3. rawIntensity → global work size = rawIntensity; it provides an extreme amount of control, of debatable usefulness in this specific context and with a high chance of producing sub-optimal or invalid settings, but it is rather clear in both concept and implementation.
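The three legacy formulas above can be sketched as follows. This is only an illustration of the arithmetic described in the list; SHADER_COUNT and any per-algorithm adjustments are assumptions, and real legacy miners apply further algorithm-dependent tweaks.

```python
# Sketch of the three legacy work-size formulas described above.
# The example SHADER_COUNT is invented; legacy miners derive it from the driver.

def gws_intensity(intensity):
    """Oldest method: global work size is a power of two."""
    return 2 ** intensity

def gws_xintensity(xintensity, shader_count):
    """Scales with an estimate of "how big" the GPU is."""
    return shader_count * xintensity

def gws_raw(raw_intensity):
    """Direct control: the value is used as-is."""
    return raw_intensity

# e.g. a GPU reporting 512 "shaders":
print(gws_intensity(13))        # 8192
print(gws_xintensity(64, 512))  # 32768
print(gws_raw(20000))           # 20000
```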

None of the above works like linearIntensity. Incrementing linearIntensity by 1 increases the number of hashes computed at each step by an algorithm-implementation-specific amount. The idea is that the same linearIntensity produces more or less the same system responsiveness regardless of the algorithm or implementation being used.

The hashCount is therefore always linearIntensity multiplied by an algorithm-implementation constant. The implication is that algorithm implementations using multiple steps can go with a higher multiplier.

Once hashCount has been determined, the global work size is hashCount multiplied by the internal parallelism used by the specific kernel for the corresponding processing step. When a kernel employs internal parallelism, the work size is a 2D vector. By convention, hashCount is the 'y' value while the kernel parallelism degree is the 'x'. In other terms, parallel kernels employ multiple execution units to produce a single hash.

Note that linearIntensity by itself does not scale with hardware performance. This makes its behaviour easily predictable.

As an example, Qubit fivesteps (signature 69b38ac0d0b99f73) with linearIntensity=128 → hashCount=32768=32Ki. This produces five work dispatch calls:
  1. Luffa<1, 32Ki>(...) // "single threaded"
  2. CubeHash<2, 32Ki>(...) // 64Ki work items
  3. SHAVite3<1, 32Ki>(...)
  4. Echo<8, 32Ki>(...) // 256Ki WI → 4096 64-way wavefronts on AMD GCN in "teams" of 8
  5. SIMD<16, 32Ki>(...)
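The arithmetic behind the dispatch list above can be reproduced with a small sketch. The multiplier of 256 is inferred from the example itself (128 → 32Ki), and the per-kernel parallelism degrees are taken from the listed dispatches, not from the actual M8M source.

```python
# Reproduce the Qubit "fivesteps" dispatch sizes from the example above.
# Multiplier 256 is an inference: 32768 / 128. Parallelism degrees are the
# 'x' values from the listed dispatch calls.

QUBIT_FIVESTEPS_MULTIPLIER = 256
STEPS = [("Luffa", 1), ("CubeHash", 2), ("SHAVite3", 1), ("Echo", 8), ("SIMD", 16)]

def dispatches(linear_intensity, multiplier, steps):
    hash_count = linear_intensity * multiplier
    # 2D work size: x = kernel parallelism degree, y = hashCount
    return [(name, (parallelism, hash_count)) for name, parallelism in steps]

for name, (x, y) in dispatches(128, QUBIT_FIVESTEPS_MULTIPLIER, STEPS):
    print(f"{name}<{x}, {y}>: {x * y} work items")
# Echo<8, 32768> → 262144 = 256Ki work items, i.e. 4096 64-way wavefronts on GCN.
```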

What about dynamic intensity?

There's a plan to add it but it's far from high priority.

How can I tune <this parameter>?

It's likely you cannot. M8M is a completely new, different code base. The parameter you're looking for likely does not exist.

In theory, algorithm implementations are free to define any amount of specific parameters. In practice, at the time this is being written (last commit to master branch) the only parameter is linearIntensity. It is common to all algorithm implementations.

Are you going to release "power-optimized", low temperature kernels?

No. Low-temperature kernels typically perform poorly on a performance-per-watt metric, because there are fixed costs to keeping the system in performance mode. Even if modern GPUs are a bit smarter, it is usually best to run them at peak performance for a shorter amount of time.

As in this context there is no "shorter amount of time", if temperature is a concern to you, lower your clocks.

It runs slower than usual miners!

That's unfortunate but there's probably very little I can do as I have no high-end hardware to test.

Note that statements like this give me no useful information. Provide at least:
  1. Number of video cards, with model and clocks for each;
  2. Any modifications, including custom BIOS;
  3. Driver model;
  4. OS.

If you're looking for high performance, you might be better off with the "legacy" miner applications. M8M's main goal is to be non-invasive to your workflow and keep the system responsive; slight performance losses are considered acceptable. This expectation originated from Qubit performance, which turned out to be quite a bit faster. It is somewhat ironic, as Qubit received little to no optimization at all.

As a side note, all kernels run at least as fast on all the test hardware I have. That's a Radeon 7750 bought in Feb 2014 for 99 bucks, and no, I don't plan to upgrade, much less go multi-GPU.

Are you going to work on <that algorithm>?

Maybe. You'll have the best chances if you ask immediately after a release is published. Typically development stops for a while until I figure out what to do next.

Are you going to release kernel for <some card>?

Unlikely, especially if you're looking for performance. Working with remote hardware is possible, but I'm not all that eager to experience the thrill.

How can I solo mine?

By using another mining application. Support for solo mining has been left out on purpose, as in my opinion this is better carried out by setting up a dedicated stratum server. I have no interest in solo mining anyway.

Are you going to support <an old driver version>?

No. In particular, there will never be active back-ports. M8M follows the same practice as games: updated drivers. In theory you should always have the latest drivers on your system; in practice they could be a couple of versions behind.

Some releases (whether they get a triple version number or not) will be tested with a certain driver. If you're not running that driver, you're basically on your own.

How can I avoid using a particular card? What about card-specific settings?

As of the last commit to the master branch this is not yet possible; however, some steps have been made in that direction.
The current plan involves assigning a set of requirements to each configuration. The devices will be tested against those requirements and a list of matching configurations will be produced for each device. The first configuration satisfying its requirements will be used.
Note this is considerably different from legacy miners. Due to various quirks in the programming interfaces there is no reliable and widespread method of mapping devices coherently and persistently, so I will side-step the problem completely.
This of course doesn't change the fact that, having a single card, I have very little interest in providing this functionality, even though I have already deployed hooks to support it.
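The planned "first configuration satisfying its requirements wins" selection could look roughly like this. This is purely a hypothetical sketch of the described plan; the field names, requirement shape, and example values are all invented, not taken from M8M.

```python
# Hypothetical sketch of the planned per-device configuration selection:
# each configuration carries a set of requirement predicates; the first
# configuration whose requirements are all satisfied by the device wins.

def select_config(device, configs):
    for config in configs:
        if all(req(device) for req in config["requirements"]):
            return config
    return None  # no configuration matched this device

# Invented example: pick a "big" config for cards with enough memory.
configs = [
    {"name": "big", "requirements": [lambda d: d["memory_MiB"] >= 2048]},
    {"name": "fallback", "requirements": []},  # no requirements: always matches
]

print(select_config({"memory_MiB": 1024}, configs)["name"])  # fallback
print(select_config({"memory_MiB": 4096}, configs)["name"])  # big
```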

If I donate <x> of some crypto, will you do <task-y>?

Probably not. For a start, this is not what "donation" means. When you donate something, you give it away expecting nothing back.
When you give out something hoping to get something else back, you're not donating. In that case what you're doing is paying me or, in some cases, "offering a bounty".
Most likely I won't be interested in the offer; you can still try if you know what you are talking about.

In particular, I won't develop a higher-perf kernel for .1BTC.

If I pay <x> of some crypto, will you give me the optimized kernels?

Nope. For reasons I don't want to explain here, I have an interest in keeping them open source. You will get them for free when everyone else does. Don't worry, this means you'll easily get >4 months of benefits.
If you're referring to the faster Qubit kernel, you'll be disappointed to know it does not currently exist - I only speculated a realistic upper bound on the improvement, assuming no other bottleneck is hit.

Algorithms

Hash functions and primitives

Developers and technical

Websocket protocol documentation.

Extensions

English presentation | String token | Supported since | Superseded by
Long-term statistics | MDZ_long_term_stats | Proposal | N/A
Temperature monitoring | MDZ_temperature | Proposal | N/A
Fan speed monitoring | MDZ_fan_speed_monitor | Proposal | N/A
Clock rate monitoring | MDZ_clock_monitor | Proposal | N/A
Device load monitoring | MDZ_device_load | Proposal | N/A