Isilon Storage Tiering

Isilon Storage Tiering (also known as SmartPools) is a feature that has been around for many years. It allows you to direct data to specific storage pools (a storage pool is a group of nodes with the same node type or density), so that data is stored cost-effectively at the appropriate price level. For example (see figure 1), you may create a policy that stores all new data on Pool1, built out of fast S210 nodes. Response times will be excellent, but the price point is also higher (faster disks, faster CPUs etc.). You then create another policy that moves all data that has not been accessed for 30 days into Pool2. This Pool2 may contain X410 nodes: much more capacity (36 SATA disks), but somewhat higher response times than Pool1. Finally, a third pool, Pool3, may hold data that has not been touched for a year. This data is hosted on HD400 nodes (very dense, 59 SATA drives + 1 SSD per 4U chassis), with potentially higher response times than the second tier. However, since this tier only holds rarely accessed data, it should not impact the user experience significantly (this may of course vary from use case to use case).

The movement of data according to these policies is done by the Job Engine in OneFS. It happens in the background, and users are not impacted. The logical location of the files (the path) does not change. That means a single directory can contain files that reside on three different storage technologies.
Figure 1: Policy Based Storage Tiering with Smartpools
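The age-based tier selection described above can be sketched in a few lines of Python. This is purely an illustration of the decision logic; the pool names and thresholds mirror the example in the text, and none of this is actual OneFS code or API.

```python
import time

# Hypothetical sketch of the per-file tiering decision described above.
# Pool names and age thresholds follow the example, not any real OneFS API.
DAY = 86400  # seconds

def select_pool(last_access: float, now: float) -> str:
    """Return the target pool based on the file's last access time."""
    idle = now - last_access
    if idle >= 365 * DAY:
        return "Pool3"  # HD400 nodes: densest, cheapest tier
    if idle >= 30 * DAY:
        return "Pool2"  # X410 nodes: high-capacity SATA
    return "Pool1"      # S210 nodes: fast disks for hot data

now = time.time()
print(select_pool(now - 5 * DAY, now))    # recently used -> Pool1
print(select_pool(now - 90 * DAY, now))   # idle 90 days  -> Pool2
print(select_pool(now - 400 * DAY, now))  # idle > 1 year -> Pool3
```

In reality the Job Engine evaluates rules like this in the background for every file covered by a policy, so no client-side logic is needed.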
Storage Pools and Storage Tiers

As discussed above, a pool consists of nodes of similar types (nodes can differ slightly, but that will not be discussed here; see the storage pool compatibility rules in the manual). If you have a very large cluster, you may decide to introduce tiers. A tier contains one or more storage pools, though in many cases this is not required. The policies you set up can be applied against pools or tiers.
Figure 2: One Filesystem across different Pools/Tiers
Policies

The policies that determine where data is stored are simple to set up via the WebUI, CLI or API. Your policies are not limited to file access times alone. You may also consider file type, location, ownership or any other file attribute that is available (you may even decide to store arbitrary attributes for files in OneFS and use them in policies). The GUI already contains some nice templates to start with. For example, there is an Archive template with rules to move older data to older storage, or an ExtraProtect template that protects files with a specific attribute value at a higher protection level (e.g., n+3 instead of n+2). The WebUI is quite intuitive (see the following screenshot).
Figure 3: Create Policy Wizard
New in OneFS 8.0: CloudPools

The new feature that comes along with OneFS 8.0 is a new pool type: Cloud.
With the initial OneFS 8.0 release, available since February 2016, the following cloud storage (object API) targets are supported:
- Amazon S3
- Microsoft Azure
- EMC Elastic Cloud Storage
- EMC Isilon
Figure 4: One Filesystem extended to the Cloud with CloudPools
Secure Stub to Cloud

Assume you have set up an appropriate policy that defines which files should be moved to the cloud. For example, all files that meet all of the following criteria:
- The files are larger than 5 MB AND
- The files reside in the directory /ifs/data/stefan AND
- The files have not been touched for 3 months
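The three ANDed conditions above can be expressed as a small predicate. This is an illustrative sketch of the policy logic only; the function and its parameters are hypothetical and have nothing to do with the actual CloudPools policy syntax.

```python
# Illustrative sketch of the example policy above: all three conditions
# are ANDed. Names and thresholds are hypothetical, not OneFS syntax.
MB = 1024 * 1024
MONTH = 30 * 86400  # approx. one month in seconds

def matches_policy(path: str, size: int, last_access: float,
                   now: float) -> bool:
    """True if a file qualifies for archiving to the cloud pool."""
    return (size > 5 * MB                          # larger than 5 MB
            and path.startswith("/ifs/data/stefan/")  # in the target directory
            and now - last_access > 3 * MONTH)     # untouched for 3 months

# A 10 MB file in /ifs/data/stefan not touched for 4 months qualifies:
print(matches_policy("/ifs/data/stefan/big.iso", 10 * MB, 0.0, 4 * MONTH))
```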
Figure 5: Job Types of the Job-Engine
Once the SmartPools job kicks off (e.g., every day at 22:00), it examines all directories and files in question and, if the criteria configured in the policy are met, moves the files to the cloud. Within the policy you can also configure whether the cloud data will be compressed and/or encrypted. The encryption keys are stored on Isilon, which means no one can read your data in the cloud, even if someone gains access to your cloud storage account.
In the local filesystem a stub file remains, which contains three things:
- The file meta data (like usual: creation time, last access time, size, …)
- A ‘link’ to the cloud data
- Possibly some of the original data, cached locally
Figure 6: Content of a stub file
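The three components of a stub file can be modeled as a simple data structure. The field names below are purely illustrative; they do not reflect the actual on-disk OneFS stub format.

```python
from dataclasses import dataclass, field

# Conceptual model of a CloudPools stub file, per the list above.
# Field names are illustrative, not the real on-disk OneFS format.
@dataclass
class StubFile:
    # 1. The usual file metadata stays local
    name: str
    size: int
    ctime: float        # creation time
    atime: float        # last access time
    # 2. A 'link' to the data objects in the cloud
    cloud_object_ref: str
    # 3. Possibly some of the original data, cached locally
    cached_ranges: dict = field(default_factory=dict)  # offset -> bytes

stub = StubFile("big.iso", 10_485_760, 0.0, 0.0, "s3://bucket/obj-123")
print(stub.cloud_object_ref)
```

Because the metadata stays local, directory listings and stat calls behave exactly as they would for a regular file, which is why clients cannot tell a stub apart from an ordinary file.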
As mentioned, from the application or user point of view there is no visible difference between a normal file and one that is stubbed to the cloud (you can of course figure it out – more on that in a later post).
Recalling Files from a CloudPool and the Local Cache

When a stubbed file is accessed, its content is retrieved from the cloud and cached locally (on SSD or HDD). However, it will *not* be stored permanently in the filesystem. If it were, any regular user could fill up the local filesystem with just a few commands: the files archived in the cloud can amount to many times the capacity of the local filesystem. A permanent recall of files can therefore only be performed by the administrator or a user with the appropriate privileges. The behavior of the local cache can be modified in the CloudPools settings. For example, we can tell the cluster
- Whether to cache recalled data locally
- Whether cache read-ahead fetches only the accessed data or full files
- The cache expiration time (from one second to years)
- The writeback frequency (how often the cluster writes modified locally cached data back to the cloud)
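The four tunables above can be pictured as a small configuration object. This is a sketch only: the attribute names and default values are invented for illustration and are not the actual CloudPools setting names.

```python
from dataclasses import dataclass

# Sketch of the cache tunables listed above as a config object.
# Names and defaults are illustrative, not real CloudPools settings.
@dataclass
class CloudCacheSettings:
    cache_recalled_data: bool = True      # keep recalled blocks locally?
    full_file_read_ahead: bool = False    # prefetch whole file vs. accessed ranges
    cache_expiration_secs: int = 86400    # anywhere from 1 second to years
    writeback_interval_secs: int = 32400  # how often dirty cache is flushed to cloud

settings = CloudCacheSettings(full_file_read_ahead=True)
print(settings.cache_expiration_secs)
```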
Retention Time

The retention time you can configure with each policy defines how long archived data remains in the cloud after the local stub file has been deleted. The default is one week; after that period, the relevant data in the cloud will be deleted. In addition, you can define
- A backup retention period for incremental NDMP backups and SyncIQ. This period defines how long data in the cloud is kept if it has been synchronized by SyncIQ to another location or backed up with an incremental NDMP job. The default is 5 years. That means if someone deletes the local stub file, it can be restored by an NDMP or SyncIQ job, and the data can still be accessed for the period configured here.
- A backup retention period for full NDMP backups. Like the previous setting, but for full NDMP backups.
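How these retention windows interact after a stub is deleted can be sketched as follows. The durations follow the defaults quoted above (one week, 5 years); the function itself is an illustration, not OneFS code.

```python
# Sketch of when cloud objects become eligible for deletion after the
# local stub is removed. Defaults follow the text; function is illustrative.
WEEK = 7 * 86400
YEAR = 365 * 86400

def cloud_data_expiry(stub_deleted_at: float,
                      referenced_by_backup: bool) -> float:
    """Return the time at which the cloud data may be garbage-collected."""
    if referenced_by_backup:
        # Referenced by SyncIQ or an incremental NDMP backup: default 5 years
        return stub_deleted_at + 5 * YEAR
    # Plain stub deletion: default retention of one week
    return stub_deleted_at + WEEK

print(cloud_data_expiry(0.0, False))  # kept one week
print(cloud_data_expiry(0.0, True))   # kept five years
```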
Summary

With CloudPools, the scale-out NAS system Isilon now has a feature that allows transparent tiering to an external storage layer. Right now, two cloud providers and two external systems (Isilon, ECS) are supported, and there may be more options going forward. The data movement is transparent to clients and secured through AES-256 encryption. With CloudPools, one can implement a fast, scalable multi-protocol system with good response times that can grow almost without limit to cloud scale.
There are more aspects to cover, though. For example: performance, what happens with stub files during backup and replication, disaster recovery, access to CloudPools data from different sites, and a step-by-step approach to setting up CloudPools. Stay tuned – I will come back to these questions as my day-to-day job allows.
Upcoming Webcast on CloudPools

I'll discuss CloudPools in a webcast on April 12th, 2016. Feel free to register here to join the session: http://bit.ly/1PYTR0P
References
- Isilon OneFS 8.0 Documentation (EMC Community Network)
- All current and older Isilon and OneFS documentation
- OneFS Technical Overview
- Official EMC Isilon CloudPools site