Hashing It Out

Episode 90

Hashing It Out #90: Core Scientific - Ian Ferreira

ABOUT THIS EPISODE

On this episode, Dean and Corey have guest Ian Ferreira of Core Scientific cover the use of AI in the blockchain space and building efficient mining architecture.

Links: Core Scientific

The Bitcoin Podcast Network

Hey, what's up? So, Avalanche, let's talk about it. What's an avalanche? Snow comes down real fast, fierce, gains momentum. But I'm not talking about the natural disaster. Or, if no one's around, I guess it's not really a disaster. But anyways: Avalanche. What is it? You've heard about it, now you're gonna hear some more. It's an open-source platform for launching decentralized finance applications. Right, DeFi, that's what you want. Developers who build on Avalanche can easily create powerful, reliable, secure applications and custom blockchain networks with complex rule sets, or build on an existing private or public subnet. Right. I think what you should do right now is stop what you're doing, even if it's listening to this podcast. Stop, pull over, go to the gas station. If you need to go to a Subway, there's a Subway, like, everywhere. There's always a Subway, all right? There's always a Kroger. Just stop in a parking lot somewhere. I am adamant that you stop. Go to avalabs.org to learn more. All right, stop, go to Ava Labs. That's A-V-A labs dot org.

Now entering... Yeah, okay. Welcome to Hashing It Out, a podcast where we talk to the tech innovators behind blockchain infrastructure and decentralized networks. We dive into the weeds to get at why and how people build this technology and the problems they face along the way. Come listen and learn from the best in the business so you can join their ranks.

Welcome back to Hashing It Out. I'm your host, Dr. Corey Petty, with my co-host, Dean Eigenman. On today's episode we're going to be talking with Core Scientific. We recently had them on The Bitcoin Podcast with a different person from the company, and we wanted to get a little more technical, so we brought on Ian Ferreira, the chief product officer of Core Scientific. Ian, do the standard thing and tell us what you do, where you came from and how you joined the space.

Hey, good morning. Yes, I'm Ian Ferreira. I'm chief product officer at Core Scientific; I've been with Core Scientific approaching two years now. Before that I bounced around a couple of other machine learning startups, and before that spent a decade at Microsoft working on the search team. So I've been around the algorithms and big-data distributed-systems space my entire career.

So what do you mostly do now? What brought you to Core Scientific, and what do you do there?

Core Scientific was interesting for a couple of reasons. One is I definitely wanted to focus on AI, so that was one of my criteria. The second was that it's a very different approach. A lot of companies, if you go work in an AI role, you're going to start from business problems downwards and kind of make your way to, let's say, the TensorFlow or PyTorch layer of the stack. What was unique about Core Scientific is we were starting from the bottom: concrete, power, TDP, chipsets, understanding interconnects. It was really an opportunity to get the under-the-hood, hardware experience of AI, if you will, and then work your way up, so that once you've done that you have a full picture of everything: okay, this is going to use this library, that's going to take advantage of this silicon feature, that's going to be accelerated by this hardware infrastructure, and so on. So it just gave me a really unique opportunity to work from the bottom up.

That's really interesting.
So, for those who don't know what Core Scientific is: I guess you bill yourselves as an infrastructure company. You provide a lot of resources for people to do a myriad of things that, for the most part, need compute power, part of that being mining of various cryptocurrencies, as well as machine learning and AI and so on and so forth. It's interesting that you take it from: these are the resources that we have, these are the architectures that may fit these different types of algorithms that are applied across the board. What is that perspective like? How do you approach a problem with it? What have you learned from that?

Yeah, so, as you mentioned, Core...

...Scientific provides hosting and infrastructure as well as software services for two primary categories: one is blockchain and the other is AI. On the blockchain side we're lower down the stack, somewhere between data center as a service and infrastructure as a service, where we host mining gear for customers. And then on the AI side we're much higher up the stack, so we're pretty much a PaaS, a platform as a service, and we'll talk about that some more later. But those are the two differences, and then we have some synergies between the two. If you start at the facilities level, the common thing with crypto and AI gear is high heat, high power. So our facilities are typically much higher rated than you would find in a traditional data center. The other aspect is around controlling heat, and making sure you can deal with these machines that don't fit in the normal standard racks you might be used to. So that's the infrastructure tier. And then we did a couple of things around using algorithms for workload placement, and we do that both in AI and on blockchain. As you can imagine, on the blockchain side a very common workload is figuring out the optimal coin to mine, right? So how do you do that? You have to figure out a bunch of algorithms and ingredients to make a decision on what to mine, and we can talk a little bit more about how that works. And the same thing on the AI side: you might want to run a large training job, and you might want to know, is this better to run on Azure? Is it better to run on our infrastructure at Core? Is it better to run on AWS? Because again, you have the same equations: there's a cost and there's a compute capability, and you have to figure out what's optimal for the customer's need. So that's kind of the software overlap that we have between the two verticals.

That makes sense. You kind of just answered, or partially answered, my next question, which was: why blockchain and AI? It seems like two very abstract, different things. Why not focus on one? Why did you guys decide to do both of them? Was there that much overlap that it made sense?

Yeah, that's a great question. When the company started, before our CEO, Kevin Turner, joined, the company was primarily focused on blockchain infrastructure, and I think he saw the opportunity to expand into AI because of that similarity. And, you know, the AI business is much more niche than blockchain, but we've grown pretty substantially and, as I mentioned, we've gotten much higher up the stack, giving people this cloud experience to gain access to this infrastructure. On the blockchain side it's more of an IaaS or data center as a service, where we're managing the infrastructure for customers.

Are there any use cases where you use, let's say, the one vertical in the other? For example, is there a use case for some of your AI infrastructure in your blockchain product, or the other way around, or are they completely orthogonal?

That's a great question. Absolutely. So, going back to the earlier workload placement optimization: in the case of blockchain, the workload could be which cryptocurrency you are mining. And we have something called Deep Mine, which is an ML-based recommendation engine; customers can opt in to use it if they want to.
It tells them: hey, based on a couple of signals, which we can dig into later, we recommend that the most profitable coin for you to mine right now is X, and then we can automatically, using our infrastructure, switch their entire fleet over to that coin. We do that every thirty minutes; we calculate the profitability, and those calculations are based on AI models. I think the most natural fit for blockchain into AI, so the reverse of that, is around IoT. If you look at the massive amounts of data being collected on the edge, it's high-volume, low-value data, and so being able to put that on immutable storage is critical for AI, so you can retrain models and reproduce results. I think that's where, at least in my mind, I see the overlap of blockchain providing utility to...

AI. Well, so you're saying that the data ingestion of a lot of these IoT devices goes into some blockchain as a storage mechanism, right?

Correct, correct. Because you need low-value immutable storage, but you need a lot of it. It might be temperature sensors, so one reading getting lost is not the end of the world, but you're going to have a ton of readings from a ton of places, and you need to distribute it: immutable storage for sensor data. And immutability helps scientists reproduce results, so you don't have to worry about temperatures being altered after the fact, for example.

I understand the concept there, but a disadvantage of trying to pump a bunch of data into blockchains is state bloat. If you look at one of the current bottlenecks of the Ethereum network, it's how much state management currently exists and how much you have to get through in order to just become a running full node on the network. And if you'd like to run something like an archive node, which handles all historical data, then it becomes severely limiting in terms of the resources required to run these things. And that's just dealing with smart contract data and financial data, where you're really spending a lot of time optimizing what you put there. If you start dumping a bunch of stuff like IoT data, that's going to get exacerbated terribly. So I'd imagine something along the lines of using a blockchain for your immutability but storing a good portion of that data elsewhere and hashing it, so your routing and your immutability live on something like a blockchain, but I don't think you can scale all that data on a blockchain, in my opinion.

Yeah, and again, this is the ailment of both blockchain and AI: they're, to use the expression, suitcase words; they mean so many things. I would absolutely agree that putting that on Ethereum or, well, the Bitcoin network is overkill. They have much higher primitives and capabilities that won't be needed, and, arguably, having to generate consensus across this type of data isn't as mission critical compared to other use cases. But there are two ways to solve that. One is to dial down the tolerance of consensus, to use a simplified ledger. The other is, as you mentioned, to keep the transactional aspects in a distributed ledger but keep the storage elsewhere. Either way, for the quote-unquote off-chain deployments, as long as you're able to reasonably control the data and account for its origins, I think there's still utility there.

That makes sense. I want to dive a little more into what you said about the underlying use case. This also dovetails with something I wanted to talk about later. When you're dealing with machine learning, the data that you're ingesting is incredibly important. The whole mentality of garbage in, garbage out: you can create models on data all day long, but if what you're trying to get out of it and the data you're feeding into it aren't reconciled very well, you're going to get bad models that don't give you any predictive power. And so what you're looking for, at least in trying to leverage this other technology, is a way to have stronger confidence in the quality of the data you're ingesting.
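
As a rough illustration of the pattern Corey sketches here (not anything Core Scientific described on the show), bulky sensor data can live in ordinary off-chain storage while only a content hash is anchored to a ledger. The dict-based "ledger" and store below are stand-ins for illustration only:

```python
# Hypothetical sketch: keep bulky IoT/sensor data off-chain, anchor only a
# content hash so the data's integrity can be verified later. The "ledger"
# and "off_chain_store" here are plain Python objects, not a real chain.
import hashlib
import json

def content_hash(readings: list) -> str:
    """Deterministically hash a batch of sensor readings."""
    canonical = json.dumps(readings, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

off_chain_store = {}   # e.g. object storage keyed by digest
ledger = []            # stand-in for an append-only chain of anchors

def anchor_batch(readings: list) -> str:
    digest = content_hash(readings)
    off_chain_store[digest] = readings      # bulky data stays off-chain
    ledger.append({"digest": digest})       # only the digest is "on-chain"
    return digest

def verify_batch(digest: str) -> bool:
    """Re-hash the stored data and compare against the anchored digest."""
    return content_hash(off_chain_store[digest]) == digest

batch = [{"sensor": "temp-01", "t": 1600000000, "value": 21.4},
         {"sensor": "temp-02", "t": 1600000000, "value": 19.8}]
digest = anchor_batch(batch)
assert verify_batch(digest)
```
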
Yeah, it's absolutely universal to the machine learning industry, because the oracles, I guess you can call these oracles, are a severe problem.

Yeah, I mean, you nailed it, and it's no different from human minds: you're a function of whatever data you're exposed to. Same thing with neural networks. If you have data sets that are skewed, or you have class imbalance problems, then your predictive models will reproduce the same problems. And so it's important to have stable data sets that have been curated and are publicly available and distributed, so you don't have to move large amounts of data around the world; and stable meaning they need to be immutable so that you can reproduce results. If you have concerns that the data sets might be drifting, then it will be really difficult to reproduce results and do hyperparameter tuning on your models. So yeah, that's the use case I see for blockchain in AI and, like I said, the reverse, obviously, we spoke about more.

So I had a quick question based on something you said previously.

In one of your first answers, you said that you guys, every thirty minutes, will check which cryptocurrency is the most profitable to mine and then switch to that one. Maybe this is because I'm a bit naive on mining and everything, but all this software that does that has always seemed like, I feel like the switching and overhead costs are way higher than the benefit of actually switching. What are the actual costs associated with switching, and does it still make that much sense to consistently be switching between which cryptocurrency you're mining?

Yeah, that's a great question, and you're absolutely right: there is an opportunity cost involved in switching, and that needs to be modeled into your parameters. It's actually more complex than folks think. It could be anywhere from how you actually switch the mining and whether that incurs downtime on the machines, right, so you might lose hash rate for a couple of seconds, and that needs to be modeled in. You also need to model in the ability to liquidate the orders on the exchange side. So I think your point is a hundred percent accurate that, if not done correctly, you might actually end up hurting yourself more than helping yourself. But if you do it correctly, you can factor all those parameters in and still make a decision that makes sense from a financial point of view. We've been running this since I've been here, and we've seen a pretty substantial lift. We're only switching between three coins right now, BCH, BSV and BTC, and we did explore PPC at one point, but that has so much hair around consensus and getting confirmed blocks that we jettisoned mining it. But those are the three we switch between and, truth be told, a lot of the opportunity between those really comes from the fact that Bitcoin obviously has a two-week difficulty adjustment, and the other coins have, you know, an almost per-transaction adjustment. That creates these gaps between a slow difficulty change on Bitcoin and a very dynamic difficulty change on the other two coins, and that creates the spreads that, if you're smart, you can take advantage of.

That makes sense as to why you'd start with those, because you have very specific mining hardware, and the underlying algorithms that take advantage of that hardware are a pretty one-to-one mapping; there's not a lot of variance across the different chains, right? And then you have this large variance across how difficulty changes across networks, and the associated price fluctuations, so training on that alone makes sense. Whereas trying to bring in other things, like Ethereum and GPU-based mining across the board, which covers a myriad of other coins, drastically increases the amount of things you have to ingest and make decisions on.

Correct, correct.

Okay. Is that something you're looking into getting into? Because I know you do more than ASIC mining at Core Scientific.

Yeah, so most of our fleet is ASIC, you know, SHA-256-based gear. We do have GPU customers that have GPU gear, but you hit the nail on the head: once you go to GPU and you start switching algorithms...
You know, you change the power consumption of the gear, and some of these coins have such low liquidity that it's really hard: you might make a calculation that said coin is the most profitable, and then when you try to go clear an order on an exchange, you can't get enough buyers. And so we're trying to stay away from some of the fringe exchanges, and we're trying to focus on coins that we believe have enough liquidity to be able to do the actual liquidation as well as just the switching. On the GPU side there are a ton more different coins, and I don't feel there's the same gravity around a certain few coins on the GPU side that's worth exploring. But we have looked at it; to answer your question, we've definitely looked at it, and our algorithms can consume these signals. But, like I said, the majority of our fleet is...

...SHA-256. So my next question is: what are all the types of things that you're ingesting as signals for these systems to make decisions on? It seems like, as was mentioned earlier in The Bitcoin Podcast episode, it's everything from computational power, energy price, market signals, social data, and so on and so forth; there's a tremendous number of things you can ingest. What are you seeing as the most useful indicators for figuring these things out?

Yeah, I wish I could tell you some super interesting story of, you know, it's correlated with the amount of blah blah blah. The reality is that we do look at a lot of signals, but good old trade volume is still one of the strongest drivers in terms of a feature for predicting price. We look at our first-party signals; obviously, managing a fleet the size of the one we do gives us a ton of direct signals, as you mentioned, on the cost side with power. And then we look at some third-party signals: obviously exchange prices and difficulty, block rewards, the usual ingredients you would put in place to calculate profitability. But to answer your question, at least in our experience, if you do the principal component analysis, the biggest contributor to our forecasting is really trade volume. There are some macro trends, like the rainy season in southwest China, that do play a role, but we don't model that in our models; we just handle it as an overarching adjustment.

So you mentioned predictions, and I mean, it's quite a popular opinion that crypto is volatile and not very rational. Judging by your models and your predictions, do you agree with that statement? Can you actually predict things reliably enough based on these inputs that you're taking, which, as you said, is volume and such? Are these actually good indicators of how the price will be affected in the future? And what's the time horizon of those predictions? Because that's going to get worse and worse as you go further into the future.

Correct. So if your audience could see me, they'd see the gray hair from trying to predict cryptocurrencies. Yes, the predictions we do are short term, and that allows us to mostly try and forecast whether we believe we're on an upward trend or a downward trend on a currency. Then we use that to set the order floor prices. So if we believe Bitcoin is going up, we're going to try and increase our clearing prices on the exchange to just above ask, so we can kind of step up; and if we think the reverse, obviously, we go slightly below. But that's about a seven-day-out forecast. Once you try to go further out than that, it's almost crazy town. There are so many weird factors that drive the price, and you just can't get signals around some of these. Some giant whale goes and sells a lot of bitcoin: you get the signal in trade volume, but you don't get anything ahead of that. So if you do a short enough prediction, I think you can still get utility out of it, but we've tried to run models out further into the future and it's pretty difficult to get a good read.
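
As a hedged illustration of the kind of switching decision discussed above (this is not Core Scientific's Deep Mine logic; the revenue formula, numbers, and switching-cost model are all assumptions made for the sketch), a minimal profit-switching check might look like this:

```python
# Compare expected revenue per coin from price, block reward, and network
# difficulty, and only switch when the gain beats an estimated switching cost
# (lost hash rate during changeover, liquidation slippage, etc.).
from dataclasses import dataclass

@dataclass
class CoinStats:
    price_usd: float        # forecast or spot price
    block_reward: float     # coins per block
    difficulty: float       # network difficulty

def expected_usd_per_th_day(c: CoinStats) -> float:
    # Expected blocks per TH/s per day scales inversely with difficulty;
    # difficulty * 2**32 is the expected number of hashes per block.
    hashes_per_day = 1e12 * 86400.0
    blocks_per_day = hashes_per_day / (c.difficulty * 2**32)
    return blocks_per_day * c.block_reward * c.price_usd

def best_coin(current: str, coins: dict, switch_cost_usd_per_th: float) -> str:
    revenue = {name: expected_usd_per_th_day(c) for name, c in coins.items()}
    best = max(revenue, key=revenue.get)
    # Only switch if the daily gain outweighs the one-off switching cost.
    if best != current and revenue[best] - revenue[current] <= switch_cost_usd_per_th:
        return current
    return best

coins = {
    "BTC": CoinStats(price_usd=11000, block_reward=6.25, difficulty=1.7e13),
    "BCH": CoinStats(price_usd=250,   block_reward=6.25, difficulty=3.3e11),
    "BSV": CoinStats(price_usd=170,   block_reward=6.25, difficulty=2.9e11),
}
print(best_coin("BTC", coins, switch_cost_usd_per_th=0.02))
```
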
Right. But that being said, how do you see the marriage of, I guess, cryptocurrency trading or prognostication of price evolving over time? Where do you see it going? Do you see it incorporating more and more assets? Do you see it trying to narrow down different sources, better machine learning techniques, better computation for training models faster? When you think about the advancement of machine learning and AI as it pertains to the work that you use it for, what's next? What are you excited about?

Yeah, so, real quick on the blockchain side and then I'll switch to just pure-...

...play AI. On the blockchain side, I think the one approach is to go after more signals. But again, I think there are some events that impact the pricing that you won't have signals for unless you were somehow able to see into the mind of somebody who decides to drop a bunch of coin on the marketplace. So I think it's really about the coins themselves stabilizing and becoming less volatile, which they're going to have to do anyway if they want to be a quote-unquote fiscal currency. On the pure-play AI side, I think the sky is the limit. I mean, we've obviously worked with NetApp and Nvidia on COVID-19 research. To go back to my earlier comment, AI is a suitcase term, so it means a lot. You could consider AI to be anything from a simple linear regression in a spreadsheet, which a lot of companies are doing when they do sales forecasts, and then you can go slightly more complicated and say, okay, deep learning, and the two primary use cases there are computer vision and natural language processing. On the computer vision side, we're seeing a lot of companies use that for anything in the life sciences: we see computational pathology, where you're using machine learning models, computer vision, to diagnose; we're seeing manufacturing defect detection; we're seeing loss prevention. The most recent use case, which I thought was pretty interesting, is with a company we're working with in Europe. Apparently it hasn't hit the US yet, so I don't want to give people ideas, but in Europe they're stealing these small ATMs that you find in, like, a 7-Eleven. The way they do it, they drill little holes into these self-enclosed units, then they inject some sort of gas into them, and then they ignite this gas, and that's basically how you crack open these ATM boxes. And so what this company is doing is basically detecting these canisters, and whenever they see somebody in the ATM camera and they see these canisters come out, they notify the police. So there are very interesting use cases for pattern recognition in computer vision, and then obviously you have the slew of other use cases, anything from crowd control to social distancing, on the computer vision side. Natural language processing is probably the next biggest. There's a lot of innovation happening there; much larger models are being trained. Think about BERT, GPT-2, and those use cases run from robotic process automation to chatbots. Whenever you interact with a chatbot, that's natural language processing; it uses a language model to figure out how to speak back to you. There are also, obviously, the call centers. We're working with a customer that is doing an enormous amount of speech-to-text, and what they're trying to do is, when you call into a call center, they want to listen to the customer conversation and say, oh, this is a conversation about X, Y, Z, and then be able to use AI models to come up with recommendations for the call center person: oh, if the customer is talking about this device, then these are some of the issues we've seen before, right? So really helping close that knowledge gap, if you will, on the call center side. So those are kind of the three big blocks. Then, when you look at what people think of as AI, the Cyberdyne Terminator, completely autonomous entity...
I think that's referred to as general artificial intelligence. I have not seen that; it will be a while for that one. And if that's what you mean by AI, then adoption is very low. If the other two categories are what you mean by AI, then I think it's pretty ubiquitous, and it's being woven into a lot of software. If you look at your office suites or spreadsheets, they all now have little widgets that help you with, oh, are you trying to do X, Y or Z, or hey, we think you're trying to write a letter, do you want to blah blah. So that's all effectively AI or machine learning.

You know, something that I think a lot of people don't quite...

...understand about that, even further than the differentiation you just made, is the fact that these models are trained and then compacted in a way, and then they live on the client side. Think about some of the text recommendations in your chat, or how your Siri talks to you, things like this. A lot of the time the actual processing and recommendation is happening locally on your client, not communicating with some central server that then does it and sends it back to you. And I think a big part of the push, or innovation, in recent machine learning and AI is getting these things to be useful in such a compact size that they can run on mobile devices and so on and so forth. Am I interpreting that correctly?

Yeah, absolutely, and that's the other pivot of quote-unquote AI, machine learning. There's the model training aspect, which is where you need the big high-end devices, GPUs, lots of power, lots of data, lots of crunching numbers; and then they produce, let's take ImageNet, a hundred and eighty gigabytes of images, and it maybe produces a model that's, you know, ten megabytes. And that ten-megabyte model then needs to be deployed on the inference side, or the edge side, which is where most of the real-world applications happen. So you're right: if it's a chatbot, the model was trained in one facility on one set of hardware, but when you're actually chatting, you're talking to the inferencing side of it. And you're right that to run that on the edge it needs to be compacted, quantized, as they say, where you reduce the precision of the weights of the model to get it to fit on smaller gear. And you're looking at different hardware; you're looking at, potentially, FPGA accelerators versus general-purpose GPUs on the training side. But you're right on, and that's another key to AI maturing: it's moving away from lab projects, which are mostly training models and testing the accuracy of said models, to actual production deployments, where we actually put these models to work, and then it's all about inferencing, it's all about the edge.

Is there something that you're most interested in? Like you said, AI is a suitcase term; it's an incredibly broad spectrum of very deep specialties, similar to web development or data science or whatever other suitcase terms get associated with the same type of stuff. Is there something that you're specifically interested in?

Yeah, you know, I think there are kind of two schools of thought. One is that AI eventually replaces a lot of what we do, but I think a long time before that happens, it will enhance what we do. And this could be anything from, going back to my Terminator references (and I'm not a Terminator fanboy, or maybe I am, I guess), if you recall when they showed how the computer was looking at stuff, how it had all these hints show up. So imagine a world where you have Wikipedia and, you know, Bing and Google, and whenever you look at something it gets recognized; you can look at a car and it'll tell you the model and you can see all that. So it's this concept of augmenting what we do, not replacing what we do. Or if a doctor is doing surgery, they have all these KPIs, almost like a heads-up display, available to them.
Or if you're working in medicine: our ability as humans to recognize patterns is phenomenal, but the amount of data that's needed these days... AI is just great at recognizing patterns. If you look at the work that Google's doing with DeepMind and in other spaces, where they can look at an iris scan and diagnose diseases, it's phenomenal that these signals exist; we just weren't able to recognize these patterns. I think AI is going to help a lot by making us as humans better, stronger, faster, and then there will be, I think, quite a few jobs that AI will subsume as it gets smarter and more general purpose. But I don't know if that's in our lifetimes or our kids' lifetimes. What I do know will be in our lifetimes is AI that makes us better, makes us more efficient, and helps us understand stuff more deeply and more thoughtfully.

I definitely see that and agree with it quite a bit, but what I'm kind of seeing here, I would call that a marriage...

...of, if I think about the current exponential technologies that are in our hands, it's more of a marriage of machine learning and AI with augmented reality and virtual reality, exactly. Because of that, there's an obvious connection between those two technologies and how they work well together. And then when you start to bring that back into what we're trying to do in the blockchain space, it's kind of about, I guess, preventing the dystopian manipulation of that world once it exists. So, say you imagine a world in which a good portion of how you view the world, make decisions and interpret things is done through the assistance of some type of augmented reality that is taking advantage of a lot of machine learning algorithms placed in whatever software you're using, right? How those things get created, how they ingest data, and then how they feed it to you are all entry points into manipulation in some shape or form. And so I feel as though a potential use case for what we're trying to do in the blockchain space is trying to give stronger guarantees around that data, whether it be what you're using to train the underlying model or, I guess, the methodology by which it absorbs or gives out whatever it's doing. Is that kind of what you were talking about earlier, in terms of needing really strong guarantees about the data we're using to do these things? Because I think when I was talking on The Bitcoin Podcast, we were discussing the importance of price data and the feeds that you're ingesting, which coins at what price, and the historical record over time that informs you on what to then mine and make decisions on, and that's a very important thing that cannot be manipulated if you're going to make a good decision. So that's the underlying oracle issue that comes up. But if we just expand that across the board, having very good data is a very important thing, and how that data gets stored, verified and then moved to the things that are actually going to use it is incredibly important. I think that's where blockchain maybe comes into play.

Yeah, absolutely. So, going back to that augmented reality: because machine learning can learn patterns so fast, you're absolutely right. If you have broad application of this augmented reality, or this ability to augment what we do and how we understand things, and all of a sudden you have a lot of people running on models that were biased one way or the other, it's pretty dangerous, in the sense that you could have broad biases replicated out by models that were trained on bad data, and it will be so woven into what we do that it'll be hard to detect. One of my favorite books to plug is Thinking, Fast and Slow, and it talks about the human mind's near inability to distinguish truth as long as something is ubiquitous. So basically, it doesn't matter how crazy an idea is: if you have enough social affirmation of that concept, you will experience it as full truth. So take that with biased models and you really create this short circuit where things could go really haywire.
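
One concrete guard against that failure mode, which Ian touches on next, is checking that live inputs haven't drifted away from the training distribution. A minimal sketch, assuming SciPy is available and using a simple two-sample Kolmogorov-Smirnov test with an arbitrary threshold:

```python
# Hypothetical drift check: compare the distribution of an incoming feature
# against the training data and flag the feed when they diverge too much.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)  # what the model saw
live_feature = rng.normal(loc=0.4, scale=1.2, size=1_000)    # what production sees

def drifted(reference: np.ndarray, live: np.ndarray, p_threshold: float = 0.01) -> bool:
    stat, p_value = ks_2samp(reference, live)
    return p_value < p_threshold   # low p-value: distributions likely differ

if drifted(train_feature, live_feature):
    print("warning: live data has drifted from the training distribution")
```
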
And so, because of that, I think models are going to become almost like crypto keys, or stuff that gets managed and trained by groups that have made sure that the classes are balanced within the models. And if you do have models that are doing inferencing on live data, there are a lot of technologies that look at what's called data drift, where you're just making sure that your incoming data sets aren't varying so much from the stuff the model was trained on that you start making pretty much random predictions. So yeah, it's garbage in, garbage out, on steroids with machine learning, so you have to be very careful.

That's interesting. So you mentioned the sharing of models, and models being managed by a group of people. Are there already examples of this happening, like open-sourcing AI models?

Yeah, absolutely.

So most of the big models are either trained by academia or by some of the large companies. If you look at the first major natural language processing model, ELMo, that was trained by the Allen Institute; BERT was trained by Google; there's GPT; and Microsoft did a new one, a transformer model, that's the largest model. These models are shared, and then what most practitioners do is go ahead and tune these models. Very few companies actually start from scratch, or take ImageNet, which is a famous computer vision model, and start from scratch. They usually take one of these models and do what's called fine-tuning, where you say, okay, this model knows how to detect visual objects, but maybe it doesn't know how to detect a boat, for example, just making this up, and so you can then tweak that model slightly so it learns to detect boats. So yes, to answer your question, a lot of the bigger models, for power, consumption and cost reasons, are done by a few companies or academia, but then they are used by a lot more general-purpose folks out in the field. So in a way that's already happening, and you can imagine it's just going to get more and more ubiquitous as new types of models and new patterns that need to be recognized come along.

This is off the wall: about five years ago I published a paper, not in quantum computing but in quantum dynamics calculations, working on modeling quantum mechanics on classical computers, and I had adapted neural networks to be used as a computationally efficient interpolation system. To give a quick preface: there are these incredibly large, complex multivariate surfaces that are required for these types of calculations on supercomputers, right, and every time you evaluate one of these functions, which has a lot of variables in it, it takes a significant amount of time to do the calculation. These functions need to be evaluated millions of times, and so they become somewhat of a limiting factor in how long it takes to get the job done on the supercomputers, which is incredibly expensive. The goal of the paper was to use machine learning, specifically neural networks, to create basically a computationally efficient function that gets evaluated very, very quickly, trained on those large surfaces. Is that something you see in the ecosystem in other ways, using neural networks as interpolation engines as opposed to trying to prognosticate based on some corpus of data?

Yeah, I mean, I think it's tractable that we get more adoption. If you look at the state of some of the model architectures, we're still seeing new model architectures come out, and so I think we're far from having dialed in exactly what the best activation function is, or what the best architecture is, and whether you should do recurrent or convolutional or some recombination. So I think there's a lot we can still learn, and a lot of new applications for neural networks. As we get more real-world applications, I think it's going to require different ways of thinking about models.
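
A minimal sketch of the fine-tuning pattern Ian describes, using an ImageNet-pretrained ResNet from torchvision as the shared model and a made-up two-class "boat or not" head (the specific model and classes are illustrative, not from the episode):

```python
# Take a model pretrained by someone with far more compute, freeze its
# backbone, and retrain only a small new head for your own classes.
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)       # weights learned on ImageNet

for param in model.parameters():               # freeze the pretrained backbone
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 2)  # new head: boat / not boat

# Only the new head's parameters would be passed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
print(f"training {sum(p.numel() for p in trainable):,} of "
      f"{sum(p.numel() for p in model.parameters()):,} parameters")
```
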
You know, one thing that was quite interesting, speaking of an out-of-the-box use case: Rice University did a hash-based implementation of deep neural nets called SLIDE, and they basically deviated from Geoffrey Hinton's backpropagation approach for figuring out the weights of a neural net. That was a completely different approach: it doesn't require GPUs, it doesn't do a lot of matrix multiplication; instead it does a lot of hash table lookups. And I can imagine that, with the innovation in hardware and in new network architectures, in five years' time we'll be looking back...

...at some of these models and go: remember when we used to use ImageNet or ConvNets? How primitive those were? This is what we're doing today. So yeah, I think there are definitely going to be new use cases that come out.

That's about enough; I could probably talk about this for a while, but without going down too many tangents: are there any questions that you wish I would have asked that I didn't?

No, I think you guys did great. I mean, I'm really excited about the AI space. I think it is a transformational technology. Andrew Ng said it's like electricity, and I believe that. It's going to be transformational; it's going to be in everything we do in one shape or form. It's going to be very powerful, which means it can be used for good and bad, so we're going to have to keep tabs on it, just like you do with any technology, and I think it's going to help us solve some challenging problems. If you look at the pace at which AI can learn versus the pace at which we've learned (I actually wrote an article, a blog post, on this), evolution only lets you learn through lineage; then we started writing, so you could share stuff between folks, and other people could learn and have memories of, oh, don't stick your hand in a lion's mouth because it'll bite it; okay, that becomes something you've learned from trial and error. But with deep learning, we can continuously learn, and that in and of itself is amazing: what we'll be able to do by having something continuously learning, learning, learning, observing, recognizing new patterns. So I'm really excited about AI, which is why I work in this field. I'm excited about what role Core plays in powering data scientists to solve these problems. And so yeah, I think you guys did a good job. Is there anything else you want to ask, or want me to dig into more for your audience? Dean?

I mean, there are so many stupid questions I could ask about AI, because I'm genuinely interested in the field but don't know much about it. I'll be the stupid audience member and ask those questions. Okay, so let's start with the recent one: GPT-3, right? What's the big deal?

So when you get to these language models, it's really about what's called embeddings, word embeddings. It's trying to learn the relationships between things in an almost multidimensional graph. And typically, whether it's BERT, GPT-2, GPT-3 or ELMo, the difference there is just how many parameters you've trained. If you look at the brain, I think it's a hundred billion synapses, so think of those as parameters. And GPT-3 is just a very, very large model that was trained, and so the conjecture is: the more parameters, the more complex the model is, the more advanced it could be in detecting, in this case, language. So that's basically the thesis around these natural language processing models.

So bigger is better? Is GPT-3 the most advanced one right now?

I think... actually, I'm not sure whether there's one that Microsoft recently did, T-something, I can't remember the acronym, but it's a transformer-based model, and I'm not sure if GPT-3 is actually larger than that. Okay, you could just look up the number of parameters.
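
Looking up a model's parameter count is straightforward; for instance, with the Hugging Face transformers library (an assumption here, not something mentioned on the show), you can load the small GPT-2 and count its weights:

```python
# Load a published checkpoint and count its parameters. Larger models
# (GPT-3-scale and up) follow the same pattern but aren't freely downloadable.
from transformers import AutoModel

model = AutoModel.from_pretrained("gpt2")   # the small GPT-2 checkpoint
num_params = sum(p.numel() for p in model.parameters())
print(f"gpt2 parameters: {num_params:,}")   # roughly 124 million
```
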
Like I said, the reason this is limited to these big companies is really who has access to that much computational power to be able to train these models. But yeah, it's a toss-up between GPT-3 and T-something-or-other; I can't remember what Microsoft's one is called.

I'm kind of surprised that the model GPT-3 came out of OpenAI, right, which is the Elon Musk company, I think, right?

Correct, yes.

I'm kind of surprised at stuff like that. Is Google doing open-source stuff like this? Because you'd think they have both the computational power and the data sets required to get something equal, if not better, off the ground.

Yeah, they...

...they did. I think GPT-2 was done by Google, if I'm not mistaken. The language models all have funny names: ELMo, BERT, GPT-2, GPT-3 and then the Microsoft one, which for the life of me I can't remember. So yeah, they do come out of either academia or nonprofits. OpenAI, I believe, trained their model on Microsoft's Azure infrastructure, because Microsoft invested a good portion of infrastructure into OpenAI. But yeah, these are great for natural language processing, and it's funny, we've kind of stopped innovating much on computer vision; I think people feel that that's at a reasonable place, and now everybody is innovating on natural language processing models. And I think what will happen is we'll learn about new approaches and then circle back and go play on computer vision again. So yeah, I think this is going to continue: we're going to continue with new architectures and with finding ways to shrink them down to run on edge devices.

To that end, it seems really important to have people like you at infrastructure companies trying to figure out what power, computation, devices and architectures we have available to train these various things and where they're most efficient. Like you said, people can't do these things unless they have access to a specific amount of resources, or unless those resources are even available broadly to the people who would like to use them. And without someone like you in your position, it's really hard to bridge that gap. Are you seeing your competitors, other people who provide these resources, hire people who are interested in machine learning, researchers? Is it just part of the deal, like you don't run a company like that unless you have someone like you involved?

Well, to the first part of your statement, absolutely true. The example that comes to mind: we did an initiative with MIT where one of their researchers trained what's called BigGAN, a generative adversarial network, on our infrastructure, and that was the first time somebody trained that network outside of Google. So absolutely, our goal is to help bring this capability to all data scientists, and yeah, we want it to be democratized, to make sure that everybody can have access to this infrastructure in one shape or form. It is actually much harder to manage than traditional, you know, Dell servers, for example, and so there is an expertise around managing it, around sharing these machines and making use of them. The worst cardinal sin you could commit is to buy these expensive infrastructure machines, these servers, and then not be able to utilize them and have idle GPUs or fifteen percent utilization, especially since you paid such a hefty price for them. So we make sure that people can utilize their infrastructure in the best possible way.

That makes a lot of sense. That sends me back into almost PTSD of handling job queues on the supercomputer for scientific applications. It's like, there's so much power not being used right now, what's going on?

Right, exactly. It's brutal. You know, this researcher made a comment: the challenge, of course, is that unless you have a fully accelerated fabric, everything from storage to networking to, whether you use NVLink and NVSwitch, your PCI Express, that whole pipeline of getting data onto that silicon, onto the GPU, has to be accelerated, or you're going to end up with under-utilized infrastructure.
And, you know, if you have a GPU that's ten percent under-utilized, that adds up a lot. And so with our software, called Plexus, when you run a workload it will make a recommendation to you to make sure you're not going to run on GPUs that are going to be half utilized and that you're going to pay an arm and a leg for on a public cloud provider, for example. It will right-size the infrastructure for you to make sure that you're not overpaying. Or if you run a job, it'll tell you: listen, you have ample CPU cycles to spare; maybe consider moving some of the compute onto the CPU. So, to the meta point of how you maximize your investment and...

...make sure you don't have idle cycles: because, as amazing as they are, Nvidia releases a new GPU architecture every two years, and so you need to make sure you get the most out of that. And that's something the mining folks know a lot about. When I started with Core, S9s were all the rage, and now S9s are struggling, and so you have to maximize the value you get out of this infrastructure in the two-year window in which it's considered state of the art. And that's another place where we see an overlap between blockchain and AI.

That's really interesting. That could probably be a whole other episode in itself, trying to discuss the linkage between hardware architectures and the software that runs on them, and then how you figure out how to maximize that. Yeah, because it's not like your PC. I mean, I think my PC... well, I just got a new one, but the one I had before that was eight years old. An eight-year-old GPU is an antique, and if your gear is old, in mining the thesis is lost, right? So it's a make-the-most-while-you-have-it kind of game. All right. Dean, did you get everything else here? No? All right. Well, I really appreciate you coming on and chatting with us about this stuff. It's quite fascinating to be on this side of the conversation, which you don't hear much of. So thanks, thanks for coming on. Maybe we can have you back to dive into more stuff.

Sounds great. Thanks, everybody.
