Holy cloud! It is raining render farms

in #technology, 6 years ago (edited)

Six short years ago, I visited one of my customers, who ran an architecture firm. It was a small shop that had a bunch of big real estate developers as clients. Their prized possession was a 5-node network of dual-CPU Intel Xeon quad-core 3.2 GHz servers with 16GB of memory, NVIDIA GPUs and a terabyte of internal disk. An external 10TB RAID storage array was used to centralize the task queue. These machines were placed in a small alcove fitted with a custom air conditioner.

On that particular day they were producing a video with a running time of 15 minutes at a frame rate of 24 fps.
The general gloom in the office hit me like a hammer as soon as I walked in and my first thought was “I hope it has nothing to do with my product!”

Having reassured myself that I was not persona non grata, I figured out that one of their architects had committed the grave sin of misplacing a single light. Since most of the scenes had been copied over from their predecessors, the lighting blooper had cast a dark shadow on more than 75% of the production.

No amount of stiff upper lip was going to change the fact that they had to render more than 16000 frames all over again (75% of the 21600 frames in a 15-minute video at 24 fps comes to 16200). They decided to deliver a lower resolution video to avoid the wrath of their client, but the upshot was that it would take them another 23 days to get this over with!

I have to confess that I did think they were nuts until I sat down and started computing their needs. Eventually after a completely unnecessary re-analysis of their misfortune, I came to the sad conclusion that there was no solution in the world that would help them achieve their goals economically. The few render farms that existed at the time were expensive, poor on support and had major interoperability issues with most major rendering software versions.

Fast forward to today and the situation is considerably more complex:

  • Mobile markets have mushroomed creating a number of different resolutions that need to be supported
  • GPUs have taken rendering to new heights but the disadvantage is that the scene complexity has increased as well. Don’t believe me? Check out Yutapon cubes and the number of tiles they force you to generate
  • Decreasing render time demands expertise in understanding hardware capabilities and adjusting settings such as tile sizes, encoding formats, resolutions…
  • Most animations use multiple layers that will force prolonged compositing and hence lead to slower final render times
  • Virtual reality is infecting product/solution advertisements and incorporating real time feedback from devices and sensors
  • The popularity of 3D rendering has doubled the number of required frames
  • Photoreal renderings are all the rage

The rendering market is exploding. Here is an excerpt from a market study:

The global visualization & 3D rendering market size is projected to reach USD 5.63 billion by 2025, according to a study conducted by Grand View Research, Inc., progressing at a CAGR of 22.5% during the forecast period

Render farms are entering a golden age because of VR and AR



3D animations have become the name of the game for all industry verticals as well as use cases.

  • The healthcare industry is investing heavily in producing 3D animations for patient advocacy, device marketing, staff training and practitioner outreach
  • Manufacturing is investing in VR solutions for floor management, monitoring, design and automation
  • Automobile makers are producing real time VR/AR animations that create immersive driving experiences in the showroom
  • Companies across the spectrum are creating 3D animations for sales presentations since these work around the clock for them

Render farms are strewn across the landscape promising a wide variety of options. They charge you by the amount of CPU/GPU time that your scene consumes. They have integrations with a wide variety of graphics software like Maya, 3ds Max, Blender, etc. They run their own rendering manager, which handles distributed rendering across the farm. They promise zero infrastructure management and I believe this is a boon for individual graphics designers.

Before we go further, let us take a look at a basic rendering workflow:

It is quite clear that a lot of the intermediate renders will be done offline and only the final renders will be pushed out to the farm.

Of course there are a few enterprise companies like Sony or Pixar that create virtual desktops in the cloud and loan them out to their developers. In this case the entire workload from start to finish exists within the rendering farm.

The render farm infrastructure is quite simple:

  1. They have a dedicated distributed queuing system called the rendering manager. Each customer gets access to their own jobs and can check the status using the provided URL
  2. Rendering nodes do the actual work and use either CPU (software) or GPU (hardware) based rendering
  3. Common file storage holds all scene files, assets, libraries and the output frame images
  4. Billing is carried out on the basis of node hours and storage capacity used. All native rendering software licenses are subsumed within the basic cost
  5. Some farms even allow custom post production processing on the frames by providing computing elements on which you can run your scripts
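To make the moving parts above concrete, here is a minimal sketch of a rendering manager that queues one task per frame and hands the tasks to nodes in round-robin order, plus a billing helper based on node hours and storage. Everything in it is an assumption for illustration: the class names, the node names, the round-robin policy and the rates are made up, and a real farm's scheduler will be far more sophisticated.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Tuple

FrameTask = Tuple[str, int]  # (customer, frame number)


@dataclass
class RenderJob:
    # Hypothetical job description: one customer, one contiguous frame range.
    customer: str
    first_frame: int
    last_frame: int


@dataclass
class RenderManager:
    # Hypothetical stand-in for the farm's queuing system.
    nodes: List[str]
    queue: Deque[FrameTask] = field(default_factory=deque)

    def submit(self, job: RenderJob) -> None:
        """Queue one task per frame of the job."""
        for frame in range(job.first_frame, job.last_frame + 1):
            self.queue.append((job.customer, frame))

    def dispatch(self) -> Dict[str, List[FrameTask]]:
        """Drain the queue, handing frames to nodes in round-robin order."""
        if not self.nodes:
            raise ValueError("at least one render node is required")
        assignments: Dict[str, List[FrameTask]] = {n: [] for n in self.nodes}
        turn = 0
        while self.queue:
            node = self.nodes[turn % len(self.nodes)]
            assignments[node].append(self.queue.popleft())
            turn += 1
        return assignments


def invoice(node_hours: float, storage_gb: float,
            rate_per_node_hour: float, rate_per_gb: float) -> float:
    """Bill on node hours plus storage; both rates are placeholders."""
    return node_hours * rate_per_node_hour + storage_gb * rate_per_gb


manager = RenderManager(nodes=["node-a", "node-b"])
manager.submit(RenderJob(customer="alice", first_frame=1, last_frame=5))
plan = manager.dispatch()
print(plan["node-a"])  # [('alice', 1), ('alice', 3), ('alice', 5)]
print(plan["node-b"])  # [('alice', 2), ('alice', 4)]
```

Round-robin is the simplest possible policy; it ignores the fact that some frames take longer than others, which is exactly the kind of thing a production rendering manager has to handle.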

Why would a designer even consider a render farm?

Large production houses will always have their own infrastructure. Sony, for example, has a render farm with more than 100K cores at its disposal. Small companies with very few projects on hand will render them offline. But the sweet spot is always in the middle!

Let us consider a simple animated video that runs for 15 minutes at 24 fps and renders on an Intel Xeon CPU at 5 minutes per frame. This means that our final render is going to process 21600 frames (15 × 60 × 24).
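If you want to plug in your own numbers, the arithmetic is easy to script. This is just a sketch of the calculation above; the function names are my own, and it assumes every frame takes the same time to render, which is rarely true in practice.

```python
def total_frames(fps: int, duration_minutes: float) -> int:
    """Number of frames in a clip of the given length."""
    return round(fps * duration_minutes * 60)


def node_hours(frames: int, minutes_per_frame: float) -> float:
    """Compute time in hours if a single node rendered every frame."""
    return frames * minutes_per_frame / 60


frames = total_frames(fps=24, duration_minutes=15)
print(frames)                 # 21600
print(node_hours(frames, 5))  # 1800.0
```

So the job represents 1800 hours of work for a single machine before any parallelism is applied.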

Let us put this question to one of the render farms. I picked RebusFarm but you can pick any one.
Here is the basic calculator configuration:

RebusFarm’s cost is given below:

The time spent on the job is:

The customer that I alluded to at the outset would have rendered this in 360 hours (21600 frames × 5 minutes ÷ 5 servers), provided the servers did nothing else. This is possible because he had already invested in 5 instances of the benchmark server that RebusFarm used for its calculations. But the point is that if you really have 15 days to hog-tie your infrastructure then you don’t need an external render farm anyway. Remember that you can mimic the render farm’s hardware configuration but you cannot match its scale and efficiency.
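The same back-of-the-envelope estimate can be turned around to ask how many nodes a deadline demands. The sketch below assumes identical nodes, a constant 5 minutes per frame and no queuing or transfer overhead; the function names and the 24-hour deadline are only examples.

```python
import math


def wall_clock_hours(frames: int, minutes_per_frame: float, nodes: int) -> float:
    """Elapsed hours when frames are split evenly across identical nodes."""
    frames_per_node = math.ceil(frames / nodes)
    return frames_per_node * minutes_per_frame / 60


def nodes_needed(frames: int, minutes_per_frame: float, deadline_hours: float) -> int:
    """Smallest number of identical nodes that finishes within the deadline."""
    frames_per_node = math.floor(deadline_hours * 60 / minutes_per_frame)
    if frames_per_node < 1:
        raise ValueError("deadline is shorter than a single frame render")
    return math.ceil(frames / frames_per_node)


hours = wall_clock_hours(frames=21600, minutes_per_frame=5, nodes=5)
print(hours)                        # 360.0
print(hours / 24)                   # 15.0
print(nodes_needed(21600, 5, 24))   # 75
```

Five servers need 15 days, while a one-day turnaround needs about 75 of them, which is the kind of burst capacity that only makes sense to rent.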

Based on the use case, the render farm cost is very reasonable given that your invoice to the client will always have rendering charges as a separate billable line item. If you had to do this animation in 3D, it would practically double the number of frames (to roughly 43200), and it becomes impossible to handle such workloads in an offline setup.

What is the approach that is most feasible?

Remember that using render farms is also a learning experience. You will have to package your assets, libraries and special licensed modules into an easily exchangeable format. You will have to use references to assets rather than cloning or duplicating them with each scene. You will have to break each frame down into tiles if the rendering of each frame needs to be sped up.
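Tiling sounds more exotic than it is. Here is a rough sketch of how a frame could be cut into rectangular tiles that can be rendered independently; the function name, the 1920×1080 resolution and the 256-pixel tile size are all assumptions for illustration, and your rendering software will have its own tile settings.

```python
from typing import Iterator, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def split_into_tiles(width: int, height: int, tile_size: int) -> Iterator[Tile]:
    """Yield tiles row by row; edge tiles are clipped to the frame."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            yield (x, y, min(tile_size, width - x), min(tile_size, height - y))


tiles = list(split_into_tiles(1920, 1080, 256))
print(len(tiles))   # 40
print(tiles[-1])    # (1792, 1024, 128, 56)
```

Smaller tiles spread the work across more nodes but add per-tile overhead, so the right size depends on your hardware and scene.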

You can always build your own farm if you have the inclination but let us ignore it because for most companies it is not a choice. We will also exclude collaborative free farms because that is no different from running a BitTorrent client and sharing your computing resources, with the added loss of data privacy.

Fortunately the evolution of the cloud has given birth to a wider variety of options. If you Google render farms then you will get many hits and quite a few misses!

Render farms do what you want to do on premise a whole lot better and more economically. The benefits are obvious:

  • If time is your biggest obstacle then render farms will help you save it by throwing more nodes at it
  • Factor in the cost of setting up the hardware, hardware maintenance, centralized storage repository, software license management, cooling infrastructure and staff augmentation, and you might be inclined to go with render farms
  • Setting up a DIY render farm may not be your core competence; why not absorb the cost and rely on an external provider?

If privacy is not a huge factor then blockchain based render farms will work for you

Blockchain is creating render farms by harnessing spare GPU/CPU time from all across the world. It is in essence a decentralized distributed rendering cloud. There are many candidates you can choose from like RenderToken or Golem. These two run on the Ethereum network and you have to purchase tokens from them to launch your job. Transaction fees are paid to the farm provider and eventually to the render nodes.

Your transactions will be publicly viewable since the farm is powered by a blockchain. In most cases each rendering node receives only some of the tiles, so it is difficult for any one node to piece your data together.
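To see why scattering tiles limits what any single node can see, consider the toy sketch below. It is not how RenderToken or Golem actually schedule work; the shuffle-and-deal policy, the node names and the seed are assumptions made purely to illustrate the idea.

```python
import random
from typing import Dict, List, Sequence, Tuple

Tile = Tuple[int, int]  # (column, row) index of a tile within one frame


def scatter_tiles(tiles: Sequence[Tile], nodes: Sequence[str],
                  seed: int = 0) -> Dict[str, List[Tile]]:
    """Shuffle the tiles, then deal them out to nodes like a deck of cards."""
    if not nodes:
        raise ValueError("at least one node is required")
    shuffled = list(tiles)
    random.Random(seed).shuffle(shuffled)  # seeded so the plan is repeatable
    assignment: Dict[str, List[Tile]] = {node: [] for node in nodes}
    for i, tile in enumerate(shuffled):
        assignment[nodes[i % len(nodes)]].append(tile)
    return assignment


tiles = [(col, row) for row in range(5) for col in range(8)]  # 40 tiles
nodes = [f"node-{i}" for i in range(10)]                      # made-up names
plan = scatter_tiles(tiles, nodes)
largest_share = max(len(t) for t in plan.values()) / len(tiles)
print(largest_share)  # 0.1
```

With 40 tiles and 10 nodes, no node ever holds more than a tenth of the frame, although a determined party controlling many nodes could still collect more.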

Since there is no dedicated hardware setup in a datacenter, the costs promise to be a whole lot cheaper. In addition, it is a market driven economy and past transactions will fuel future ones thus creating a race to the bottom. The downside is that support will be limited and span only the most popular rendering software. Since the quality of the hardware on the loaned render nodes cannot be guaranteed, render times could be impacted. Even though the platform promises accountability, it is still dependent on render nodes that it does not own.

Fully managed farms

source: http://us.gmocloud.com/oldblog/blog/2012/12/19/cloud-based-rendering-the-logical-next-step-for-render-farms/

These are dedicated farms created by experts who have themselves managed massive farms during their careers. They generally have the widest interoperability suite and extremely consistent hardware at their disposal. They have been doing this for the last decade and are highly focused on taking your job, distributing it across their resources and returning the results to you. It is a low touch service and actually perfect for companies that want to offload all their rendering responsibilities. There are many of them around: RenderNow, Garage Farm, RebusFarm...

Most of them have calculators that give you a quick appraisal of what it costs to run your rendering job on their farm, and this is an invaluable asset for independent design firms that need to price their own offerings.

In the short term they offer better ROI than self-managed hardware. In the long term, they spare you the complexity of running your own farm. They also provide scalability and dependable terms of service, and they are probably the least expensive solutions available today. On the downside, it is not possible to customize post production or intermediate renders without knowing about internal deployments. Extensibility is generally not a hallmark of their offerings as it is a WYSIWYG model.

Cloud provider farms

source: https://lesterbanks.com/2013/04/your-own-personal-render-farm-zync-makes-cloud-rendering-as-easy-as-printing-locally/

Public cloud providers offer recipes that can be launched based on specific jobs. These recipes include pre-configured virtual machine images complete with licensed rendering software, private networks for hosting all the render nodes, shared storage, data encryption, job resource management nodes and monitoring nodes. These run on computing power that you have to instantiate using their management portals.

They generally have job scheduling plugins for popular software that will pick up the scene from your environment, render the frames and return the output back to you.

Take a look at Azure Batch Rendering, Zync on Google or Deadline on AWS.

Zync, which is probably the cheapest offering here, will render the job we discussed earlier for approximately $4000.
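It helps to reduce a quote like this to unit prices so that it can be compared with other options. The sketch below simply divides the $4000 figure by the 21600 frames and the 1800 render hours of our example job; it assumes the 5 minutes per frame of the Xeon benchmark still holds, which may well not be true on the provider's hardware, and the function name is my own.

```python
from typing import Tuple


def implied_unit_prices(quote_usd: float, fps: int, minutes: float,
                        minutes_per_frame: float) -> Tuple[float, float]:
    """Return (cost per frame, cost per render hour) implied by a quote."""
    frames = fps * minutes * 60
    render_hours = frames * minutes_per_frame / 60
    return quote_usd / frames, quote_usd / render_hours


per_frame, per_hour = implied_unit_prices(4000, fps=24, minutes=15,
                                          minutes_per_frame=5)
print(round(per_frame, 3))  # 0.185
print(round(per_hour, 2))   # 2.22
```

That works out to roughly 19 cents per frame, a number you can put straight into a client invoice as a rendering line item.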

Their scaling is unmatched and they can render complex jobs very easily. Load balancing and distribution across data centers gives you geographic immunity. Reliability and availability are the hallmarks of public cloud providers and if you have a sizeable job then this solution is for you. Additionally, you can use any part of the cloud pipeline to fully extend post production workflows. Spot resources can be bid upon to reduce computing overheads. Load spikes are handled seamlessly, and geographically distributed collaborative design and development is easy. On the downside, costs are going to be high because they generally run virtual machines, and you need to be cloud savvy to monitor and fix jobs that are not running as scheduled.

DIY cloud farms

source: https://www.pipelinefx.com/9-tips-for-rendering-in-the-google-cloud/

Now you are the adventurous one! You want to build your own rendering farm in the cloud with a mixture of your own tools as well as orchestration workflows offered by the provider. If you have large IT teams capable of computing costs, deploying resources and managing them 24/7, then the cloud is a massive opportunity for you.

AWS Data Pipeline can be used to composite frames with minimal compute intervention. Google’s App Engine is a great framework for creating a dynamically scaling rendering network. PipelineFX’s Qube! is a terrific render pipeline manager that makes it easy to link up with the rest of the cloud. Google’s Filestore is made to order for building giant render farms. You can build entire movies from scratch in the public cloud and probably stream them out to a live audience if required.

These providers give you unlimited flexibility and massive scalability. This approach also makes great sense if your entire production workflow is already in the cloud. But it is not for the faint-hearted. It requires serious technical expertise and you need to understand your rendering application, cloud infrastructure and your needs. Licenses have to be purchased separately and monitoring mechanisms have to be carefully constructed.

Conclusion

Render farms are here to stay and eventually they will take over all rendering. I predict that the on-premise rendering farm will vanish gradually. All rendering will be done in the cloud as the surplus of older machines is fully utilized.

Blockchain farms are interesting but they will never appeal to the enterprise audience. However they could create a formidable niche by encouraging design studios to share resources and render amongst themselves.

Fully managed render farms will go the way of datacenter providers like Rackspace. They will ultimately not be able to compete with public cloud providers in terms of functionality and scalability. They will consolidate or be acquired for their expertise.

Real time rendering is a huge opportunity for on-premise hyperconverged infrastructure providers. But I suspect that ease of management outweighs everything and edge computing (a la OpenStack) will win this battle.

I hope you enjoyed reading. Your comments are always welcome.


🚀 This is a stellar post! 🚀

I will be featuring it in my weekly #technology and #science curation post for the @minnowsupport project and the Tech Bloggers' Guild! The Tech Bloggers' Guild is a new group of Steem bloggers and content creators looking to improve the overall quality of our niche.

Wish not to be featured in the curation post this Friday? Please let me know. In the meantime, keep up the hard work, and I hope to see you at the Tech Bloggers' Guild!


If you have a free witness vote and like what I am doing for the Steem blockchain, it would be an honor to have your vote for my witness server. Either click this SteemConnect link or head over to steemit.com/~witnesses and enter my username in the box at the bottom.

Dear @jrswab, thanks for reading my article and showcasing it. I appreciate your help.

This was pretty detailed and useful. I am glad I followed you in time for this. My grasp of the concept is very basic, but it could really help me understand the business and its marketing. Thank you for taking the time to write this.

Thank you for your comment. I am glad you found it useful.

These are the things that I understand less but still enjoy reading :)

Nice work on summing up the render farm landscape! Building on the information you presented, I'd like to make a few notes:

Blockchain-based solutions have a few challenges, both technical and operational. They usually rely on breaking frames into pieces, but not every frame can be rendered correctly this way. The nodes they use run on a variety of hardware and internet connections, which means the delivery speed can have ups and downs. And they are facing the challenge of licensing the 3D software, which means that at the moment they are either supporting only free software (Blender) or are tied to a specific vendor (as in Otoy's case).

Support is also a matter to consider. Cloud-based farms and public blockchain solutions offer a different kind of support to their customers than 'traditional' farms.

Getting started with using a farm doesn't have to be an elaborate process when it comes to preparing the project file. The processes are usually documented for each particular software. And farms can also offer one-click plugins to help with the packing and uploading.

In terms of infrastructure, there is now a new generation of hybrid solutions in the market. RenderStreet is one of the first render farms of this kind, combining both dedicated and cloud infrastructure for a better value for dollar and better availability. This translates into a more flexible service offering - from a unique low-cost monthly plan to a high-volume, cost-optimized plan dedicated to studios. There is also the benefit of a faster upgrade cycle for the hardware, all to the benefit of the artists and studios. And everything is packaged into an intuitive, automated interface.

(disclosure: I'm RenderStreet's CEO)