Adaptive flash care management & DSP IP in SSDs
What is it? Who does it? And why?
by Zsolt Kerekes, editor - June 19, 2012
The phrases "adaptive writes", "DSP
IP in flash SSD" and "adaptive flash cell care" have appeared
at various times in
past SSD
news stories, interviews and comments.
And while I sketched out as much
explanation as was needed for each story at the time - I did promise I would
eventually publish a list of SSD companies who use what I loosely called back
then "adaptive DSP technologies in SSD IP" in their new designs along
with a technology and market guide.
This article is it.
A
year ago there were only 4 or 5 companies doing this kind of thing - but I
could see this technology trend was creeping into double digits of companies
and when that happens in the SSD market you know that something significant is
going on.
Why are these new adaptive technologies important?
They
change the rules about what SSD designers can do with cheap consumer grade MLC
and TLC (x3) flash.
As I said in an
earlier article -
they reset the commonly held assumptions about the limitations of
endurance -
but that's just a small part of the new effect.
Here's a summary of what the new technologies enable in different segments of the SSD market:
- industrial SSDs - lower price and greater density - because MLC can replace SLC in most applications
- enterprise SSDs - changes the economics in fast-enough SSDs - because consumer grade MLC in the new SSDs can last as long as more expensive eMLC. But another byproduct of the new technologies is that SSDs which use them can also have significantly faster write cycles (greater IOPS) - and operate at lower power.
Each of these bullet points represents a potential competitive advantage for the early adopters of these new technologies in whichever SSD segment they are applied. But the bag of magic tricks can be used to provide different characteristics for the different markets.
Currently
only a small percentage of SSD makers deploy these technologies. In fact each of
the many companies I spoke to about this in recent months believed they were
just one of a handful of companies doing anything similar.
I told
them how wrong they were and that they might be surprised when I published this
article.
Driving the need to develop these extraordinarily complex SSD
technologies is the certain knowledge and fear that traditional SSD controller
IP will fail to deliver working SSDs with future shrinks of flash geometry.
That means a successful SSD company in generation X flash may wake up
to be told by its designers that its SSDs using generation Z flash - can't be
made to operate at all - let alone last for 1, 2 or 5 years. (Time for the
company's VPs to create / update their bios on LinkedIn.)
On the one
hand - companies which have already got the new SSD magic wands - such as
STEC - told me over a
year ago they are confident that they may see an upturn in their businesses when
their competitors (without adaptive designs) start to fall off the new flash
technology cliff.
On the other hand - with the
SSD market share
prizes getting bigger - it's more likely that enterprising SSD IP companies
will step in to sell their own maps of safe ways around the cliff - and that's
the business plan of DensBits.
But
I'm running ahead of myself now by mentioning companies.
Before I get
onto that - it's worth asking - What's at the core of the new SSD adaptive
technologies?
The simple answer has 3 parts:
- adapting the write pulse energy in the flash memory to be as low as can be - while at the same time providing usable data integrity with the attached ECC / DSP technology
- designing a set of ECC / DSP technology which can provide usable data integrity.
Unlike traditional SSD designs - the ECC / DSP strength and actual choice of algorithms can be adapted to suit the flash memory according to the circumstances.
That means the same memory block may have different ECC codes wrapped around it at different times in its operating life - depending on how healthy it looks. And different ECC codes may be used within the same memory chip at the same time.
- Other techniques include adapting the spatial distribution of data - and making the data striping plans more flexible than traditional designs.
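The second point above (ECC strength chosen per block according to how healthy the block looks) can be sketched roughly as follows. This is a minimal illustration and not any vendor's algorithm: the tier names, the correctable-bit counts, the `BlockHealth` record and the safety margin are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical ECC tiers: (name, correctable bits per codeword).
# The strengths are illustrative, not taken from any real controller.
ECC_TIERS = [
    ("light", 8),
    ("medium", 24),
    ("strong", 40),
]


@dataclass
class BlockHealth:
    """Running health statistics a controller might keep per flash block."""
    erase_cycles: int = 0
    worst_bit_errors: int = 0  # most bit errors corrected in one codeword so far


def record_read(health: BlockHealth, corrected_bits: int) -> None:
    """Update the block's history after a read (the 'learning' step)."""
    health.worst_bit_errors = max(health.worst_bit_errors, corrected_bits)


def choose_ecc(health: BlockHealth, margin: float = 2.0) -> str:
    """Pick the weakest tier that covers the worst observed error count
    times a safety margin; fall back to the strongest tier otherwise."""
    needed = health.worst_bit_errors * margin
    for name, correctable in ECC_TIERS:
        if correctable >= needed:
            return name
    return ECC_TIERS[-1][0]


young = BlockHealth(erase_cycles=100)
record_read(young, 2)    # a fresh block shows few errors
worn = BlockHealth(erase_cycles=2900)
record_read(worn, 15)    # a worn block shows many more

print(choose_ecc(young))  # light
print(choose_ecc(worn))   # strong
```

The same block moves from a light code to a stronger one as its error history worsens, and two blocks in the same chip can carry different codes at the same time.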
None
of these individual design philosophies is entirely new.
- Systems designers like me working with the first generation of flash chips
in the early 1980s could see clearly that some locations were harder to program
than others - because the write pulses were an external user designed circuit
and software algorithm.
In later generations of flash - the
chipmakers embedded the write pulse circuits inside the memory chips to make it
easier for digital designers to use flash - and to reduce the risk of systems
designers over cooking the flash. So the awareness of this parameter may have
gone away for systems designers - but it was always an important part of the
memory chip design.
- Adapting to the reliability population curve of flash memory too - with
different controller technologies like wear-leveling and bad block management
goes back to the early 1990s and a company called
M-Systems.
- Spatial scattering of data across multiple memory chips in SSDs with
RAID goes back to the
1990s too and a company called
Solid Data Systems.
Variability in this parameter (variable size stripes) has been used in 2 generations of SSD designs already by Texas Memory Systems.
What is new about the new adaptive SSD flash care management IP is that instead of each of these parameters and design rules being fixed at the time of manufacturing the SSD according to an idea of what works best for the population of flash chips in this current generation - as with traditional controller designs - the new adaptive SSDs have smarter technologies which can each dynamically interact with and learn from the chips they're connected to.
You don't have to understand the internal details of how these
individual techniques work.
And with hundreds of patents already pending in this topic there's a high probability that the SSD vendor won't give you the details anyway (not even under NDA) - even if you are yourself among the rare set of people on the planet with the educational background to understand them.
It's enough to get the general idea.
Now as we're talking about the SSD market here - you don't really expect that anything will stay clear cut for long. And so you won't be surprised to know that it isn't.
Earlier this year for example -
Smart Storage
launched SSDs which used the knowledge learned from tweaking its adaptive SSD
controllers and then reapplied this back as a set of fixed parameters to
precondition flash chips so they would run better with traditional unmodified
controllers from LSI/SandForce.
You can think of this as presetting the write pulse parameters in the memory
chips with a better set of magic numbers than even the memory chip makers or
SandForce themselves would have come up with on their own.
When that product was launched in
April 2012 - I
said "SMART's trick with the SandForce controllers is like using
Dolby correction with a 1980s cassette tape. Whereas SMART's trick with its
Optimus controller is like having a built-in dynamic sound equalizer."
I
doubt if that's the last we'll hear about
hybridizing
some of the IP knowledge acquired from developing these newer technologies and
then reapplying them back into earlier designs to stretch their market life.
I
promised you some kind of list of companies who are using these new adaptive
technologies inside their SSD designs - so here goes.
The list below
is my first draft - and I'll expand it next month with more detail when this
article moves off the home page and gets its own permalink.
In the
meantime - SSD companies which aren't mentioned below - but who think they
should be - can contact me with details of supporting evidence about what
they're doing in this area. (I know there are some companies in stealth mode
- both the deliberate and accidental kind.)
Anobit (acquired by Apple)
DensBits (chosen by
Seagate)
InnoDisk
Link_A_Media Devices
(acquired by SK Hynix)
LSI
Memoright
Micron
OCZ (Indilinx)
PMC
Proton Digital Systems
SanDisk
Skyera
SMART
STEC
XLC Disk
My
preliminary list above and the expanded list later is only going to include
companies which have developed their own SSD controllers which use these new
adaptive techniques - and not companies which
simply license IP
from a DSP IP controller company.
For more (serious) articles on this
theme take a look at the
SSD reliability papers,
SSD controllers page -
and for a lighter and more whimsical view of this aspect of the SSD market
today you might want to see
how to choose a flash
health care scheme to make your SSD last longer.
even more stuff about
adaptive R/W and DSP IP in SSD technology
Editor:- June 28, 2012 -
after publishing the article above - I realized that some of you would want to
know a lot more about the subject than I know or have time to write about.
I knew that out of all the
specialist SSD market
analysts - the person who knows most about this aspect of the market and
technology was likely to be Gregory Wong, founder of Forward Insights.
So
I asked Greg - what did he think about my article? And could he tell me which of
his many SSD market reports is closest to this theme? and how much does it
cost?
Gregory Wong said - "I think you did a pretty good job
outlining these technologies. I believe the companies are mainly talking about
advanced ECC and flash signal processing."
Gregory gave me the
following useful add-on to this article.
For advanced ECC and flash
signal processing, you need 3 things: NAND statistics, soft information and
advanced decoding.
NAND statistics collection constructs the history
of the NAND flash memory cell characteristics and facilitates the estimation of
the reliability of each bit. To obtain soft information from the memory cell,
extra read commands or test mode sequences are required. These commands are
proprietary to the NAND flash manufacturer and a vendor implementing DSP would
require the NAND flash manufacturer to provide these commands.
Needless
to say, not all controller/SSD vendors will obtain this support.
Advanced decoding schemes employing soft decoding use the NAND
statistics and soft information to determine the most probable read signal
that corresponds to the actual stored data.
This allows you to
obtain readable data even when the memory cell is severely degraded or there is
a lot of 'noise' in cell data.
That is why you see companies like
Anobit and DensBits claiming a 10x improvement in endurance. STEC and Smart
Storage also claim to have similar technology.
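To make Greg's three ingredients concrete, here is a toy sketch of why soft information helps. It rests on stated assumptions throughout: the two cell states are modelled as Gaussian voltage distributions with made-up means and spread (standing in for "NAND statistics"), the analog readings stand in for the quantized soft information a real controller would get from extra reads, and a 3x repetition code stands in for a real soft-decision code such as LDPC.

```python
# Illustrative numbers only: two cell states modelled as Gaussian
# threshold-voltage distributions (hypothetical values, not vendor data).
MU_0, MU_1 = 1.0, 3.0   # mean sensed voltage for a stored 0 and a stored 1
SIGMA = 0.5             # spread, as if estimated from collected NAND statistics
HARD_THRESHOLD = (MU_0 + MU_1) / 2


def llr(voltage: float) -> float:
    """Log-likelihood ratio log(P(v | bit 0) / P(v | bit 1)).
    Positive favours 0, negative favours 1; magnitude is the confidence."""
    return ((voltage - MU_1) ** 2 - (voltage - MU_0) ** 2) / (2 * SIGMA ** 2)


def hard_decode(voltages) -> int:
    """Majority vote over one-bit (hard) read decisions."""
    ones = sum(1 for v in voltages if v > HARD_THRESHOLD)
    return 1 if ones > len(voltages) / 2 else 0


def soft_decode(voltages) -> int:
    """Sum the LLRs so that confident reads outweigh marginal ones."""
    return 0 if sum(llr(v) for v in voltages) > 0 else 1


# A 0 was stored three times; two copies drifted just past the threshold.
reads = [2.1, 2.2, 0.9]
print("hard:", hard_decode(reads))  # hard: 1 (wrong)
print("soft:", soft_decode(reads))  # soft: 0 (correct)
```

The two drifted copies are only marginally on the wrong side, so their LLRs are small, while the clean copy carries a large LLR and decides the outcome. A hard decision throws that confidence information away.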
Editor:- Greg said that his report -
ECC and Signal
Processing Technology for SSDs and Multi-bit per cell NAND Flash Memories 2nd
Edition - which costs $6,500 - has been his best selling SSD report.
Since first publishing this article - as a blog on the home page of StorageSearch.com - the increasing importance of this topic has been reinforced by its many appearances as a strategic thread in these later articles.
"One petabyte of enterprise SSD could replace 10 to 50 petabytes of raw HDD storage in the enterprise - and still run all the apps faster and at lower cost."
meet Ken and the SSD event horizon
 |
| . |
|

|
| |
.... |
 |
As every SSDmouse knows - measuring stuff and adapting to what you know gives you safer operating speed, better reliability and lower TCO.
summary
In
the future - all NAND flash SSDs will have to use adaptive R/W and DSP ECC IP
technologies in their
controller schemes in
order to be able to use newer generations of denser flash memory. Among other
things these adaptive R/W techniques can magnify
reliability and
performance while
improving SSD design
efficiency and reducing
cost.
As
we go through the transition
years - all the safe assumptions which you thought you knew about flash SSDs
and suppliers will change (again).
 |
..... |
"The variability of the LDPC decode time is a function of how many iterations it takes to decode the data from the flash and can be up to 20 microseconds."
the latency implications of DSP ECC (May 15, 2014)
4 years ago - in - SSD market history
Anobit samples 1st Memory Signal Processing
flash SSDs
Editor:- June 15, 2010 - Anobit
announced
it is sampling SSDs based on its patented Memory Signal Processing technology
which provide
20x improvement in operational life for MLC SSDs in high IOPS server
environments.
Based on proprietary algorithms that compensate for the
physical limitations of NAND flash, Anobit's technology (a variation of
adaptive R/W
and DSP ECC) extends standard MLC endurance from approximately 3K
read/write cycles to over 50K cycles - to make MLC technology suitable for
high-duty cycle applications.
This guarantees drive write endurance of 10 full disk writes per day, for 5 years.
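The endurance figures quoted above can be cross-checked with some simple arithmetic. The sketch below uses only the numbers in the announcement (3K and 50K cycles, 10 full disk writes per day, 5 years). It assumes perfect wear leveling, 365-day years and no over-provisioning, and the function names are my own.

```python
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def lifetime_drive_writes(writes_per_day: float, years: float) -> float:
    """Total full-drive writes the host issues over the warranty period."""
    return writes_per_day * DAYS_PER_YEAR * years


def write_amplification_budget(rated_cycles: float,
                               writes_per_day: float,
                               years: float) -> float:
    """How much internal write amplification the rated cycles can absorb."""
    return rated_cycles / lifetime_drive_writes(writes_per_day, years)


def years_of_life(rated_cycles: float,
                  writes_per_day: float,
                  write_amplification: float = 1.0) -> float:
    """Years until the rated cycles are used up at a given workload."""
    return rated_cycles / (writes_per_day * write_amplification * DAYS_PER_YEAR)


print(lifetime_drive_writes(10, 5))                          # 18250
print(f"{write_amplification_budget(50_000, 10, 5):.2f}")    # 2.74
print(f"{years_of_life(3_000, 10):.2f}")                     # 0.82
```

Under these assumptions, 50K cycles cover the 18,250 host drive writes with room for a write amplification of about 2.7. Standard 3K-cycle MLC would be worn out in under a year at the same workload, even with a write amplification of 1.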
 |
Above - Erase Pulse Control - NAND Reliability Improvement with Controller Assisted Algorithms in SSD (pdf) - a paper by SK hynix at the Flash Memory Summit (August 2013)
"How long before we get to clinical trials?"
...from - flash care schemes - will Brand X flash care make your SSD live longer? (Brand Y has better tv ads.)
How adaptive is the SSD
behavior to changes within itself?
All SSDs rely on processing data
about the quality of the memory as part of their normal data integrity
operations.
They wouldn't work without it.
But some companies have SSD IP sets in which knowledge about different
parts of the SSD can be optimized and fed back to control and enhance SSD
functionality over and beyond the standard accepted SSD function block
boundaries.
The degree to which this passing of the intelligence
(regarding the state of past and future anticipatable data flows, priorities
of the application and the flash array's own readiness and healthiness
condition) can impact behavior in other parts of the SSD - is what I call
adaptive intelligence flow symmetry.
11 Key Symmetries in SSD design
LSI says it pays to get a 2nd opinion from LDPC
Editor:- August 13, 2013 - in a presentation today at the Flash Memory
Summit -
the
Nibbles and Bits of SSD Data Integrity (pdf) - LSI explained why
reserving the use of LDPC to deal mostly with read error retries (and also
later in the operating life of flash cells) can be a pragmatic design choice.
And
instead of applying different strengths of
ECC for fixed
physical block sizes - the company says another approach is to have variable
sized virtual blocks - which effectively means that better cells carry lower ECC
overhead.
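One way to picture the variable sized virtual block idea is below. This is my own reading of the general principle, not LSI's implementation, and the page size, health grades and parity budgets are all hypothetical. Each physical page has a fixed raw size, so a healthier page spends fewer bytes on parity and has more room for user data.

```python
PAGE_BYTES = 9216  # hypothetical raw page size (data area plus spare area)

# Hypothetical parity budgets per health grade (bytes of ECC per page).
PARITY_BYTES = {"good": 512, "fair": 1024, "weak": 2048}


def payload_bytes(grade: str) -> int:
    """User data that fits in one page once parity is set aside."""
    return PAGE_BYTES - PARITY_BYTES[grade]


def ecc_overhead(grade: str) -> float:
    """Fraction of the raw page consumed by ECC parity."""
    return PARITY_BYTES[grade] / PAGE_BYTES


def pages_needed(data_len: int, grades) -> int:
    """Count how many pages (taken in order) are needed to hold data_len bytes."""
    remaining = data_len
    used = 0
    for grade in grades:
        if remaining <= 0:
            break
        remaining -= payload_bytes(grade)
        used += 1
    if remaining > 0:
        raise ValueError("not enough pages for the requested data")
    return used


print(pages_needed(32 * 1024, ["good"] * 8))  # 4
print(pages_needed(32 * 1024, ["weak"] * 8))  # 5
```

Here the same 32KB of user data fits in 4 healthy pages but needs 5 weak ones. That is the sense in which better cells carry lower ECC overhead.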
...Later:- In November 2013 -
LSI began sampling the
SF3700
SSD controller (pdf) - which included elements of adaptive DSP in its design
as well as the unique ability to be configured as either a
small
architecture or large architecture controller.
How big was the thinking in this SSD's design?
Does size really matter in SSD design?
By that I mean how big was the mental map? - not how many
inches wide is the SSD.
The novel and the short story both have their
place in literature and the pages look exactly the same. But you know from
experience which works best in different situations and why.
When
it comes to SSDs - Big versus Small SSD architecture - is something which was
in the designer's mind. Even if they didn't think about it that way at the time.
For designers, integrators, end users and investors alike - understanding what follows from these simple choices predicts a lot of important consequences. ...read the article
Surviving SSD sudden power loss
Why should you care what happens in an SSD when the power goes down?
This article will help you understand why some SSDs (which work perfectly well in one type of application) might fail in others... even when the changes in the operational environment appear to be negligible.