Hashing It Out

Episode 90 · 2 years ago

Hashing It Out #90 - Core Scientific - Ian Ferreira

ABOUT THIS EPISODE

On this episode, Dean and Corey have guest Ian Ferreira of Core Scientific cover the use of AI in the blockchain space and building efficient mining architecture.

Links: Core Scientific

The Bitcoin Podcast Network

Hey, what's up? So, Avalanche, let's talk about it. What's an avalanche? Snow comes down real fast, fierce, gains momentum. But I'm not talking about the natural disaster. Or it's not really a disaster, I guess, if no one's around. But anyways, Avalanche. What is it? You've heard about it, now you're gonna hear some more. It's an open-source platform for launching decentralized finance applications. Right, DeFi, that's what you want. Developers who build on Avalanche can easily create powerful, reliable, secure applications and custom blockchain networks with complex rule sets, or build on an existing private or public subnet. Right. I think what you should do right now is stop what you're doing, even if it's listening to this podcast. Stop, pull over, go to the gas station. If you need to, go to a Subway, there's a Subway, like, everywhere. There's always a Subway. All right, all right, there's always a Kroger. Just stop in a parking lot somewhere. We're adamant that you stop. Go to AVA Labs, avalabs.org, to learn more. All right, stop, go to AVA Labs. That's avalabs.org. Now entering Hashing It Out. Welcome to Hashing It Out, a podcast where we talk to the tech innovators behind blockchain infrastructure and decentralized networks. We dive into the weeds to get at why and how people build this technology and the problems they face along the way. Come listen and learn from the best in the business so you can join their ranks. Welcome back to Hashing It Out. Your host, Dr. Corey Petty, with my co-host, Dean Eigenman. In today's episode we're going to be talking with Core Scientific. We recently had a podcast with them over on The Bitcoin Podcast with a different person from the company, and we wanted to get a little more technical, so we brought over Ian Ferreira, the chief product officer of Core Scientific. Ian, do the standard thing and kind of tell us what you do, where you came from and how you joined the space. 
Hey, good morning. Yes, I'm Ian Ferreira. I'm chief product officer at Core Scientific. I've been with Core Scientific approaching two years now. Before that I had bounced around a couple of other machine learning startups, and I spent a decade at Microsoft working in the search team. So I've been around the algorithms and big-data distributed-systems space my entire career. Mostly, what do you do now? Like, what brought you to Core Scientific and what do you do there? So, Core Scientific was interesting for a couple of reasons. One is I definitely wanted to focus on AI, so that was one of my criteria. The second was it's a very different approach. A lot of companies, if you go and work in an AI role, you're going to start from business problems downwards and kind of make your way to, you know, let's say the TensorFlow or PyTorch layer of the stack. What was unique about Core Scientific is we were starting from the bottom. We were starting from concrete, power, TDP, chipsets, you know, understanding interconnects. It was really an opportunity to get the under-the-hood experience, the hardware experience, if you will, of AI, and then work your way up, so that once you've done that, you have a full picture of everything from, okay, this is going to use this library, that's going to take advantage of this silicon feature, that's going to be accelerated by this hardware infrastructure, and so on. So it just gave me a really unique opportunity to work from the bottom up. That's really interesting. So, for those who don't know what Core Scientific is, I guess you bill yourselves as an infrastructure company. You provide a lot of resources for people to do a myriad of things that need compute power, part of that being mining of various cryptocurrencies, as well as machine learning and AI, and so on and so forth. 
It's interesting that you take it from: these are the resources that we have, these are the architectures that may best fit these different types of algorithms that are applied across the board. What is that perspective like? How do you approach a problem from it, and what have you learned from that? Yeah, so, as you mentioned, Core...

...Scientific provides hosting and infrastructure as well as software services for two primary categories. One is blockchain and the other is AI. On the blockchain side we're lower down the stack, you know, somewhere between data center as a service and infrastructure as a service, where we host mining gear for customers. And then on the AI side we're much higher up the stack. We're pretty much a PaaS, platform as a service, and we'll talk about that some more later, but those are kind of the two differences. And then we have, you know, some synergies between the two. If you start at the facilities level, the common thing with crypto and AI gear is high heat, high power. So our facilities are typically much higher rated than you would find in a traditional data center. The other aspect is around controlling heat, making sure you can deal with these machines that don't fit in the normal standard racks that you might be used to. So that's the infrastructure tier. And then we did a couple of things around using algorithms for workload placement. We do that both in AI and in blockchain. As you can imagine, on the blockchain side, a very common workload question is figuring out the optimal coin to mine, right? So how do you do that? You have to figure out a bunch of algorithms and ingredients to make a decision on what to mine, and we can talk a little bit more about how that works. The same thing on the AI side: you might want to run a large training job and you might want to know, is this better to run on Azure? Is it better to run on our infrastructure at Core? Is it better to run on AWS? Because again, you have the same equations. There's a cost and there's a compute capability, and you have to figure out what's optimal for the customer's need. So that's kind of the software overlap that we have between the two verticals. That makes sense. 
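The venue-selection math he describes (cost versus compute capability under a deadline) can be sketched as a toy placement function. The venue names, GPU counts, and hourly prices below are invented for illustration, not Core Scientific's actual numbers:

```python
# Toy workload-placement decision: pick the cheapest venue that can
# finish a training job before its deadline. All figures are made up.

def place_workload(gpu_hours_needed, deadline_hours, venues):
    """Return the name of the cheapest venue that meets the deadline, or None."""
    feasible = []
    for name, v in venues.items():
        runtime = gpu_hours_needed / v["gpus"]  # wall-clock hours on this venue
        if runtime <= deadline_hours:
            cost = gpu_hours_needed * v["usd_per_gpu_hour"]
            feasible.append((cost, name))
    if not feasible:
        return None
    return min(feasible)[1]  # min by cost

# Hypothetical venues: own infrastructure is cheap but small, clouds are
# larger but pricier per GPU-hour.
venues = {
    "core":  {"gpus": 8,  "usd_per_gpu_hour": 0.90},
    "azure": {"gpus": 16, "usd_per_gpu_hour": 2.50},
    "aws":   {"gpus": 16, "usd_per_gpu_hour": 2.40},
}

best = place_workload(400, 60, venues)  # loose deadline: cheapest venue wins
```

With a loose deadline the cheap in-house option wins; tighten the deadline and the decision flips to whichever cloud can parallelize enough, which is exactly the cost-versus-capability trade-off described above.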
You kind of just answered, or partially answered, my next question, which was: why blockchain and AI? It seems like two very abstract, different things. Why not focus on one? Why did you guys decide to do both of them? Was there that much overlap that it made sense? Yeah, that's a great question. I think when the company started, before our CEO, Kevin Turner, joined, the company was primarily focused on blockchain infrastructure, and I think he saw the opportunity to expand into AI because of that similarity. And, you know, the AI business is much more nascent than blockchain, but we've grown pretty substantially and, as I mentioned, we've gotten much higher up the stack, you know, giving people this cloud experience to gain access to this infrastructure. On the blockchain side it's more of an IaaS or data center as a service, where we're managing the infrastructure for customers. Are there any use cases where you use, let's say, the one vertical in the other? For example, is there a use case for some of your AI infrastructure in your blockchain product, or are they completely orthogonal? Now that's a great question. Absolutely. So, going back to the earlier workload placement optimization: in the case of blockchain, the workload could be, you know, which cryptocurrency are you mining? And we have something called Deep Mine, which is an ML-based recommendation engine which, if customers opt in to use it, will tell them, hey, based on a couple of signals, which we can dig into later, we recommend that the most profitable coin for you to mine right now is X, and then we can automatically, using our infrastructure, switch their entire fleet over to that coin. And we do that every thirty minutes. We calculate the profitability, and those are based on AI models. I think the most natural fit for blockchain into AI, so the reverse of that, is around IoT. 
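The "most profitable coin right now" calculation behind a recommendation engine like the Deep Mine he describes boils down to expected-share-of-blocks arithmetic. This is a hedged sketch; every number (hashrates, rewards, prices, power cost) is invented, and the real engine ingests many more signals:

```python
# Toy per-coin profitability comparison: expected revenue is our share of
# the network hashrate times blocks per day times reward times price.

def expected_daily_revenue(hashrate, network_hashrate, block_reward,
                           blocks_per_day, price_usd):
    """Expected USD/day for one coin, given our fleet's share of its hashrate."""
    share = hashrate / network_hashrate
    return share * blocks_per_day * block_reward * price_usd

def best_coin(fleet_hashrate, daily_power_cost_usd, coins):
    """Return (most profitable symbol, profit-per-coin dict)."""
    profits = {}
    for symbol, c in coins.items():
        rev = expected_daily_revenue(fleet_hashrate, c["network_hashrate"],
                                     c["block_reward"], c["blocks_per_day"],
                                     c["price_usd"])
        profits[symbol] = rev - daily_power_cost_usd
    return max(profits, key=profits.get), profits

# Invented illustrative figures, not real market data.
coins = {
    "BTC": {"network_hashrate": 120e18, "block_reward": 6.25,
            "blocks_per_day": 144, "price_usd": 30000},
    "BCH": {"network_hashrate": 2e18, "block_reward": 6.25,
            "blocks_per_day": 144, "price_usd": 400},
    "BSV": {"network_hashrate": 1e18, "block_reward": 6.25,
            "blocks_per_day": 144, "price_usd": 150},
}
symbol, profits = best_coin(0.5e18, 50_000, coins)
```

Re-running this every thirty minutes with fresh prices and difficulty is the scheduling loop described above; the ML part is in forecasting the inputs, not in this arithmetic.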
So if you look at the massive amounts of data being collected on the edge, it's high-volume, low-value data, and being able to put that on immutable storage is critical for AI, so you can retrain models and reproduce results. I think that's where, at least in my mind, I see the overlap of blockchain providing utility to...

AI. Well, so you're saying that the data ingestion of a lot of these IoT devices goes into some blockchain as a storage mechanism, right? Correct. Correct, because you need low-value immutable storage, but you need a lot of it, because it might be temperature sensors, and so one reading getting lost is not the end of the world, but you're going to have a ton of readings from a ton of places. You need distributed, you know, immutable storage for sensor data, and immutability helps scientists reproduce results, so you don't have to worry about temperatures being changed after the fact, for example. I understand the concept there, but a disadvantage of trying to pump a bunch of data into a blockchain is state bloat. If you look at one of the current, I guess, bottlenecks of the Ethereum network, it's how much state management currently exists and how much you have to kind of get through in order to just become a running full node on the network. And if you'd like to run something like an archive node, which handles all historical data, then it becomes severely limiting in terms of the resources required to run these things. And yeah, that's just dealing with, like, smart contract data and financial data, where you're really spending a lot of time optimizing what you're putting there. You start dumping a bunch of stuff like IoT data, and that's going to get exacerbated terribly. So I'd imagine something along the lines of using a blockchain for your immutability but storing a good portion of that data elsewhere and hashing it, so you have your routing, your immutability, in something like a blockchain, but I don't think you can scale all that data on a blockchain, in my opinion. Yeah, and again, this is the ailment of both blockchain and AI. They're such, to use the expression, suitcase words. They mean so many things. 
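The "hash it and store the bulk elsewhere" pattern from this exchange can be sketched in a few lines: keep the sensor batches in ordinary storage and anchor only a fixed-size digest on chain, so tampering is detectable without on-chain bloat. `batch_digest` is a hypothetical helper, not part of any named product:

```python
# Anchor a SHA-256 digest of a sensor-data batch; the batch itself stays
# off chain. Any later edit to the batch changes the digest.

import hashlib
import json

def batch_digest(readings):
    """Canonical JSON -> SHA-256 hex digest suitable for an on-chain anchor."""
    canonical = json.dumps(readings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

batch = [{"sensor": "temp-01", "t": 1620000000, "celsius": 21.5},
         {"sensor": "temp-01", "t": 1620000060, "celsius": 21.6}]

anchor = batch_digest(batch)           # this 64-hex-char value goes on chain
assert batch_digest(batch) == anchor   # reproducible from the same data
batch[0]["celsius"] = 99.9             # simulated after-the-fact tampering
assert batch_digest(batch) != anchor   # tampering detected against the anchor
```

The canonical JSON encoding (sorted keys, fixed separators) matters: the digest must be reproducible byte-for-byte, which is also what lets a scientist later verify that a training set matches the anchored version.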
I would absolutely agree that putting that on Ethereum or, well, the Bitcoin network is overkill. They have much higher primitives and capabilities that won't be needed. And, you know, arguably, having to generate consensus across this type of data shouldn't be as mission critical compared to other use cases. But there's two ways to solve that. One is to dial down the tolerance of consensus and use a simplified ledger. The other is, as you mentioned, to keep the transactional aspects in a distributed ledger but keep the storage elsewhere. Either way, with these quote-unquote off-the-chain deployments, as long as you're able to reasonably control the data and account for its origins, I think there's still utility there. That makes sense. I want to kind of dive a little more into what you said about the underlying use case, and this also dovetails with something I wanted to talk about later. When you're dealing with machine learning, the data that you're ingesting is incredibly important. The whole mentality of garbage in, garbage out. You can create models on data all day long, but if what you're trying to get out of it and the data you're feeding into it aren't reconciled very well, you're going to get shit models that don't give you any predictive power. And so what you're looking for, at least in trying to leverage this other technology, is a way to have stronger confidence in the quality of the data you're ingesting. Yeah, it's absolutely universal to the machine learning industry, because the oracles, I guess you can call these oracles, are a severe problem. Yeah, I mean, you nailed it, and it's no different from human minds. You're a function of whatever data you're exposed to. So same thing with neural networks. 
If you have data sets that are skewed or you have class imbalance problems, then your predictive models will reproduce the same problems. And so it's important to have stable data sets that have been curated, that are publicly available and distributed so you don't have to move large amounts of data around the world, and stable meaning they need to be immutable, so that you can reproduce results. If you have concerns that the data sets might be drifting, then it will be really difficult to reproduce results and to do hyperparameter tuning on your models. So yeah, I mean, that's the use case I see for blockchain in AI and, like I said, the reverse, obviously, we spoke about more. So I had a quick question based on something you said previously.

One of your first answers when I asked you, you said that you guys, every thirty minutes, will check which cryptocurrency is the most profitable to mine, yeah, and then switch to that one. Maybe this is because I'm a bit naive on mining, but with all this software that does that, it's always seemed like the switching and overhead costs are way higher than the benefit of actually switching. Like, what are the actual costs associated with switching, and does it still make that much sense to consistently be switching between which cryptocurrency you're mining? Yeah, now that's a great question, and you're absolutely right. There is an opportunity cost involved in switching, and that needs to be modeled into your parameters. It's actually more perplexing than folks think. It could be anywhere from how you actually switch the mining and whether that incurs downtime on the machines, right, so you might lose hash rate for a couple of seconds, that needs to be modeled in. You also need to model in the ability to liquidate the orders on the exchange side. So I think your point is a hundred percent accurate that, if not done correctly, you might actually end up hurting yourself more than helping yourself. But if you do it correctly, you can factor all those parameters in and still make a decision that makes sense from a financial point of view. We've been running this, you know, since I've been here, and we've seen a pretty substantial lift. We're only switching between three coins right now, BCH, BSV and BTC, and we did explore PPC at one point, but that has so much hair around consensus and getting confirmed blocks that we jettisoned mining it. 
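The opportunity-cost reasoning he lays out (lost hash rate during the changeover, liquidation slippage) reduces to a break-even comparison: switch only when the expected gain over the next window beats the one-off cost of switching. This is an illustrative model with made-up parameters, not the production logic:

```python
# Toy switching decision: expected profit gain over the decision window
# must exceed the one-off switching cost (downtime plus slippage).

def should_switch(current_profit_per_hr, candidate_profit_per_hr,
                  window_hr, downtime_hr, slippage_usd):
    """True if switching coins pays for itself within the window."""
    gain = (candidate_profit_per_hr - current_profit_per_hr) * window_hr
    # Cost: revenue lost while machines restart, plus exchange slippage
    # when liquidating into the new coin's (possibly thin) order book.
    switch_cost = candidate_profit_per_hr * downtime_hr + slippage_usd
    return gain > switch_cost

# A big spread over a 30-minute window justifies the switch...
big_spread = should_switch(100.0, 150.0, 0.5, 0.01, 10.0)
# ...a small spread does not: the fixed costs eat the gain.
small_spread = should_switch(100.0, 105.0, 0.5, 0.01, 10.0)
```

This also shows why, as discussed above, doing it naively "might end up hurting yourself": with the window set too short or slippage underestimated, the comparison flips.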
But those are the three we switch between, and, truth be told, a lot of the opportunity between those really comes from the fact that Bitcoin has, obviously, a two-week difficulty adjustment, and the other coins have, you know, almost transaction-based adjustment. That creates these gaps between a slow difficulty change on Bitcoin and a very dynamic difficulty change on the other two coins, and that creates the spreads that, if you're smart, you can take advantage of. That makes sense as to why you'd start with those, because you have very specific mining hardware, and the underlying algorithms that take advantage of that hardware are a pretty one-to-one mapping. There's not a lot of variance across the different chains, right? And then you have kind of this large variance in how difficulty changes across networks and the associated price fluctuations. So training on that alone makes sense, whereas trying to bring in other things like Ethereum, GPU-based mining across the board, which covers a myriad of other coins, drastically increases the amount of things you have to ingest and make decisions on. Correct, correct. Okay, is that something you'll be looking into getting into? Because I know you do more than ASIC mining at Core Scientific. Yeah, so most of our fleet is ASIC, you know, SHA-256 based gear. We do have GPU customers that have GPU gear, but you hit the nail on the head: once you go to GPU and you start switching algorithms, you change the power consumption of the gear, and some of these coins have such low liquidity that it's really hard. You might make a calculation that said coin is the most profitable, and then when you try and go clear an order on an exchange, you can't get enough buyers. 
And so, you know, we're trying to stay away from some of the fringe exchanges, and we're trying to focus on coins that we believe have enough liquidity to be able to do the actual liquidation as well as the switching. On the GPU side there's a ton more different coins, and I don't feel there's the same gravity around a certain few coins on the GPU side that's worth exploring, but we have looked at it. To answer your question, we've definitely looked at it, and our algorithms can consume these signals. But, like I said, the majority of our fleet is SHA-256.

That's good. My next question is, what are all the types of things that you are ingesting as signals for these things to make decisions on? It seems like, and this was mentioned earlier in The Bitcoin Podcast episode, everything from computational power, energy price, market signals, social data, so on and so forth. There's a tremendous amount of things you can ingest. What are you seeing as the most useful ingestion indicators for figuring these things out? Yeah, I wish I could tell you some super interesting story of, you know, it's correlated with the amount of blah blah blah. The reality is that we do look at a lot of signals, but good old trade volume is still one of the strongest drivers in terms of a feature in predicting price. We look at our first-party signals. Obviously, managing a fleet the size of what we do gives us a ton of direct signals, as you mentioned, on the cost side with power. And then we look at some third-party signals, obviously exchange prices, difficulty, block rewards, you know, the usual ingredients you would put in place to calculate profitability. But to answer your question, at least in our experience, if you do the principal component analysis, the biggest contributor to our forecasting is really trade volume. There's some macro trends, like the rainy season in southwest China, that do play a role, but we don't model that in our models. We just kind of apply that as an overarching adjustment. So, I guess, you mentioned predictions, and it's quite a popular opinion that crypto is volatile and not very rational. So, judging by your models and your predictions, do you agree with that statement? 
Like, can you actually predict things reliably enough based on these inputs that you're taking, which, as you said, is volume and stuff? Are these actually good indicators of how the price will be affected in the future? And like I said, it's not a yes/no: what's the time horizon of those predictions? Because that's going to get worse and worse the further you go into the future. Correct. So if your audience could see me, they'd see the gray hair from trying to predict cryptocurrencies. Yes. The predictions we do are short term, and that allows us to mostly try and forecast whether we believe we're on an upward trend on a currency or a downward trend. Then we use that to set the order floor prices. So if we believe Bitcoin is going up, we're going to try and increase our clearing prices on the exchange to just above ask, so we can kind of step up, and the reverse, obviously, we go slightly below. But that's about a seven-day-out forecast, two to seven days out. Once you try and go further, it's almost, you know, crazy town. There's so many weird factors that drive the price, and you just can't get signals around some of these. Some giant whale goes and sells a lot of bitcoin; the signal you get is trade volume, but you don't get anything ahead of that. So if you do a short enough prediction, I think you can still get utility out of it, but we've tried to run models out further into the future and it's pretty difficult to get a good read. But that being said, how do you see the marriage of, I guess, cryptocurrency trading or prognostication of price evolving over time? Where do you see it going? Do you see it incorporating more and more assets? 
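The tactic he describes, forecast the short-term trend, then clear just above ask on an up-trend and just below bid on a down-trend, can be sketched with a toy moving-average trend signal. The real forecaster is an ML model, not this crossover; prices and tick size here are invented:

```python
# Toy trend-following order pricing: a short moving average above a long
# one reads as an up-trend, and the limit price steps just past the book.

def trend(prices, short=3, long=7):
    """'up' if the short-window average exceeds the long-window average."""
    s = sum(prices[-short:]) / short
    l = sum(prices[-long:]) / long
    return "up" if s > l else "down"

def limit_price(prices, best_bid, best_ask, tick=1.0):
    """Clear just above ask on an up-trend, just below bid on a down-trend."""
    return best_ask + tick if trend(prices) == "up" else best_bid - tick

rising = [1, 2, 3, 4, 5, 6, 7]
falling = [7, 6, 5, 4, 3, 2, 1]
buy_px = limit_price(rising, best_bid=99, best_ask=101)   # steps above ask
sell_px = limit_price(falling, best_bid=99, best_ask=101)  # steps below bid
```

The point of the sketch is the asymmetry, not the signal: the forecast only decides which side of the spread to cross, which keeps a wrong forecast's cost bounded to roughly the spread plus a tick.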
Do you see it trying to narrow down different sources, better machine learning techniques, better computation for training models faster? When you think about the advancement of machine learning and AI as it pertains to the work that you use it for, what's next? What are you excited about? Yeah, so real quick on the blockchain side, and then I'll switch to just pure...

...play AI. On the blockchain side, I think the one approach is to go after more signals. But again, I think there's some events that impact the pricing that you won't have signals for, unless somehow you were able to see into the minds of somebody that decides to drop a bunch of coin on the marketplace. So I think it's really about the coins themselves stabilizing and becoming less volatile, which they're going to have to do anyway if they want to, you know, be a quote-unquote fiscal currency. On the pure-play AI side, I think the sky is the limit. I mean, we've obviously worked with NetApp and Nvidia on COVID-19 research. You know, to go back to my earlier comment, AI is a suitcase term, so it means a lot. You could consider AI to be anything from a simple linear regression in a spreadsheet, which a lot of companies are doing when they do sales forecasts, and then you can go slightly more complicated and say, okay, deep learning, and the two primary use cases there are computer vision and natural language processing. On the computer vision side, we're seeing a lot of companies use that for anything in the life sciences. We see computational pathology, where you're using machine learning models, computer vision, to diagnose. We're seeing manufacturing defect detection, we're seeing loss prevention. You know, the most recent use case, we're working with this company in Europe, which I thought was pretty interesting, and apparently it hasn't hit the US yet, so I don't want to give people ideas, but in Europe they're stealing these small ATMs that you find in, like, a 7-Eleven. And the way they do it is they drill little holes into these, you know, self-enclosed units, and then they inject some sort of gas into it and then they ignite this gas, and that's basically how you crack open these ATM boxes. 
And so what these guys, this company, is doing, they're basically detecting these canisters, and so whenever they see somebody at the ATM camera and they see these canisters come out, they, you know, notify the police. So there's very interesting use cases for pattern recognition in computer vision, and then obviously you have the slew of other use cases, anywhere from crowd control to social distancing, on the computer vision side. Natural language processing is probably the next biggest. There's a lot of innovation happening there. Much larger models are being trained, think about BERT, GPT-2, and those use cases range from robotic process automation to chatbots. So whenever you interact with a chatbot, that's natural language processing. It uses a language model to figure out how to speak back to you. There's also, obviously, the call centers. We're working with a customer that is doing an enormous amount of speech-to-text, and what they're trying to do is, when you call into a call center, they want to listen to the customer conversation and say, oh, this is a conversation about X, Y, Z, and then use AI models to come up with recommendations for the call center person, to say, oh, if the customer is talking about this device, then these are some of the issues we've seen before, right? So really helping close that knowledge gap, if you will, on the call center side. So those are kind of the three big blocks. As for what people think of as AI, the, you know, Cyberdyne Terminator, completely autonomous entity, I think that's referred to as artificial general intelligence. I have not seen that, and we're a while off from that one. And if that's what you mean by AI adoption, then it's very low. If the other two categories are what you mean by AI, then I think it's pretty ubiquitous, and it's being woven into a lot of software. 
Like, if you look at your office suites or spreadsheets, they all now have little widgets that help you with, oh, are you trying to do X, Y or Z, or, hey, we think you're trying to write a letter, do you want to blah blah. So that's all effectively AI or machine learning. You know, something that I think a lot of people don't quite...

...understand about that, even further than the differentiation you just made, is the fact that these models are trained and then compacted in a way, and then they live on the client side. Think about the text recommendation for your chat, or how your Siri talks to you, things like this. A lot of the time the actual processing and recommendation is happening locally on your client, not communicating with some central server that's then doing it and sending it back to you. And I think a big part of the push for innovation in recent machine learning and AI is getting these things to be useful in such a compact size that they can be done on mobile devices and so on and so forth. Am I interpreting that correctly? Yeah, absolutely, and that's the other pivot of quote-unquote AI machine learning. There's the model training aspect, which is where you need the big high-end devices, GPUs, lots of power, lots of data, lots of crunching numbers, and then they produce, let's take ImageNet, a hundred and eighty gigabytes of images, and it maybe produces a model that's, you know, ten megabytes, and that ten-megabyte model then needs to be deployed on the inferencing side, or the edge side, which is where most of the real-world applications happen. So you're right: if it's a chatbot, the model was trained in one facility on one set of hardware, but when you're actually chatting, you're talking to the inferencing side of it. And you're right, to run that on the edge it needs to be compacted, quantized, as they say, where you reduce the precision of the weights of the model to get it to fit on smaller gear. And, you know, you're looking at different hardware. 
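The "compacted, quantized" step he mentions, reducing the precision of the weights, can be illustrated with a toy symmetric int8 quantizer over a flat weight list. Real toolchains (TensorFlow Lite, PyTorch quantization) calibrate per tensor or per channel; this is only the core idea:

```python
# Toy post-training quantization: map float weights onto int8 levels
# using one symmetric scale, shrinking storage roughly 4x vs float32.

def quantize_int8(weights):
    """Return (int8-range codes, scale) for a flat list of float weights."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard: all-zero weights
    q = [round(w / scale) for w in weights]            # codes in [-127, 127]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference on the edge device."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.003, 1.0]
q, s = quantize_int8(w)       # e.g. small weights collapse toward 0
approx = dequantize(q, s)     # close to w, within one quantization step
```

The trade visible even in this toy: the tiny weight 0.003 rounds to the same code as zero, which is exactly the precision loss that quantization-aware tooling tries to keep harmless.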
You're looking at potentially FPGA accelerators versus general-purpose GPUs on the training side. But you're right on: another key sign that AI is maturing is that it's moving away from lab projects, which is mostly training models and testing the accuracy of said models, to actual production deployments, where we actually put these models to work, and there it's all about inferencing, all about the edge. Is there something that you're most interested in? Like you said, AI is a suitcase term. It's an incredibly broad spectrum of very, very deep specialties, similar to web development or data science or whatever other suitcase terms are associated with the same type of stuff. Is there something that you're specifically interested in? Yeah, you know, I think there's kind of two schools of thought. One is that AI eventually replaces a lot of what we do, but I think a long time before that happens, it will enhance what we do. And this could be anything from, going back to my Terminator references, and I'm not a Terminator fanboy, or maybe I am, I guess, if you recall when they showed how the computer was looking at stuff, how it had all these hints show up. So imagine a world where you have Wikipedia at your beck and call, and whenever you look at something it gets recognized, and you can look at a car and it'll tell you the model and you can see all that. So it's this concept of augmenting what we do, not replacing what we do. Or if a doctor is doing surgery, they have all these KPIs, almost like a heads-up display, available to them. Or if you're working in medicine, our ability as humans to recognize patterns is phenomenal, but the amount of data that's needed these days... You know, AI is just great at recognizing patterns. 
If you look at the work that Google's doing with DeepMind and other spaces, where they can look at an iris scan and diagnose diseases, it's phenomenal that these signals exist; we just weren't able to recognize these patterns. I think AI is going to help a lot by making us as humans better, stronger, faster, and then there will be, I think, quite a few jobs that AI will subsume as it gets smarter and more general purpose. But I don't know if that's in our lifetimes or our kids' lifetimes. What I do know will be in our lifetimes is AI that makes us better and makes us more efficient and helps us understand stuff deeper and more thoroughly. I definitely see that and agree with that quite a bit, but what I'm kind of seeing here is what I would call a marriage...

...of, like, if I think about the current exponential technologies that are in our hands, it's more of a marriage of machine learning and AI with AR and VR, augmented reality and virtual reality, exactly. Because of that you see there's an obvious connection between those two technologies and how they kind of work well together. And then when you start to bring that back into what we're trying to do in the blockchain space, it's kind of, I guess, preventing the dystopian manipulation of that world once it exists. So, say you imagine a world in which a good portion of how you view the world and make decisions and interpret things is done through the assistance of some type of augmented reality that is taking advantage of a lot of machine learning algorithms placed in whatever software you're using, right? How those things get created, how they ingest data, and then how they feed it to you are all kind of entry points into manipulation 
in any shape or form. And so I feel as though a potential use case for what we're trying to do in the blockchain space is trying to give stronger guarantees around that data, whether it be what you're using to train the underlying model, or, I guess, the methodologies by which you absorb or give out whatever it's doing. Is that kind of what you were talking about earlier, in terms of needing really strong guarantees about the data we're using to do these things? Because when I was talking on The Bitcoin Podcast, we were discussing the importance of price data and the feeds that you're ingesting, kind of what coins at what price, and historical things over time that inform you on what to then mine and make decisions on, and that's a very important thing that cannot be manipulated if you're going to make a good decision. So that's the underlying oracle issue that comes up. But if we expand that across the board, having very good data is a very important thing, and how that data gets stored, verified, and then moved to the things that are actually going to use it is incredibly important. I think that's where blockchain maybe comes into play. Yeah, absolutely. So imagine going back to that augmented reality: because machine learning can learn patterns so fast, you're absolutely right. If you have broad application of this augmented reality, this ability to augment what we do and how we understand things, and all of a sudden you have a lot of people running on models that were biased one way or the other, I mean, it's pretty dangerous, in that you could have broad biases replicated out by models that were trained on bad data, and it will be so woven into what we do that it'll be hard to detect. 
You know, one of my favorite books to plug is Thinking, Fast and Slow, and it talks about the human mind's inability to distinguish truth as long as something is ubiquitous. Basically, it doesn't matter how crazy an idea is: if you have enough social affirmation of that concept, you will experience it as truth. Take that with biased models and you really create this short circuit where things could go really haywire. And so, because of that, I think models are going to become almost like crypto keys, things that get managed and trained by groups that have made sure the classes are balanced within the models. And if you do have models doing inference on live data, there's a lot of technology that looks at what's called data drift, where you're making sure that your incoming data sets aren't varying so much from the data the model was trained on that you start making essentially random predictions. So yeah, it's garbage in, garbage out, on steroids with machine learning, so you have to be very careful. That's interesting. So you mentioned the sharing of models, and that models would be managed by a group of people. Are there already examples of this happening, like open-sourcing AI models? Yeah, absolutely.
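The data-drift check Ian describes, making sure live inputs still look like the training data, can be sketched with a population stability index, one common drift metric. Everything below is illustrative: the synthetic data, bin count, and thresholds are standard rules of thumb, not anything from Core Scientific's tooling.

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a training-time ("expected")
    and live ("actual") feature distribution. A common rule of thumb:
    < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 significant drift."""
    lo = min(expected.min(), actual.min())
    hi = max(expected.max(), actual.max())
    edges = np.linspace(lo, hi, bins + 1)
    e_frac = np.histogram(expected, edges)[0] / len(expected)
    a_frac = np.histogram(actual, edges)[0] / len(actual)
    # Floor the fractions so empty bins don't blow up the log term.
    e_frac = np.clip(e_frac, 1e-6, None)
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 10_000)    # what the model was trained on
same = rng.normal(0.0, 1.0, 10_000)     # live data, no drift
shifted = rng.normal(1.5, 1.0, 10_000)  # live data, mean has drifted

print(psi(train, same))     # small: incoming data still looks familiar
print(psi(train, shifted))  # large: predictions are now suspect
```

A monitoring job would compute this per feature on each incoming batch and alert once the index crosses a threshold, which is the "garbage in, garbage out on steroids" guardrail in practice.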

So most of the big models are trained either by academia or by some of the large companies. If you look at the first major natural language processing model, ELMo, that was trained by the Allen Institute; BERT was trained by Google; GPT by OpenAI; and Microsoft did a new one, a transformer-based model, that's the largest model. These models are shared, and what most practitioners then do is fine-tune them. Very few companies actually start from scratch, whether with these or with famous ImageNet computer vision models. They usually take one of these models and do what's called fine-tuning, where you say, okay, this model knows how to detect visual objects, but maybe it doesn't know how to detect a boat, for example, just making this up, and so you tweak that model slightly so it learns to detect boats. So yes, to answer your question, a lot of the bigger models, for power-consumption and cost reasons, are trained by a few companies or academia, but then they are used by a lot more general-purpose folks out in the field. In a way that's already happening, and you can imagine it's just going to get more and more ubiquitous as new types of models and new patterns that need to be recognized come along. A bit off the wall: five years ago I published a paper, not in quantum computing but in quantum dynamics calculations, modeling quantum mechanics on classical computers, and I had adapted neural networks to be used as a computationally efficient interpolation system.
So, to give a quick preface: there are these incredibly large, complex multivariate surfaces that are required for these types of calculations on supercomputers, and every time you evaluate one of these functions, which has a lot of variables in it, it takes a significant amount of time. These functions need to be evaluated millions of times, so they become somewhat of a limiting factor in how long it takes to complete the job on the supercomputer, which is incredibly expensive. The goal of the paper was to use machine learning, specifically neural networks, to create basically a computationally efficient function that gets evaluated very, very quickly, based on those large surfaces. Is that something you see in the ecosystem in other ways, using neural networks as interpolation engines as opposed to trying to prognosticate based on some corpus of data? Yeah, I think it's tractable that we get more adoption. If you look at the state of some of the model architectures, we're still seeing new architectures come out, so I think we're far from having dialed in exactly what the best activation function is, or what the best architecture is, and whether you should do recurrent or convolutional or some recombination. So there's a lot we can still learn, and a lot of new applications for neural networks. As we get more real-world applications, it's going to require different ways of thinking about models. One thing that was quite interesting, speaking of an out-of-the-box use case: Rice University did a hash-based implementation of deep neural nets called SLIDE, and they basically deviated from Geoffrey Hinton's backpropagation approach for figuring out the weights of a neural net, and so that was a completely different approach.
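The surrogate idea Corey describes, sampling an expensive function once and then evaluating a cheap trained network in its place, can be sketched with a tiny NumPy MLP trained by plain full-batch backpropagation. The "expensive" function here is a made-up stand-in for a real potential-energy surface, and the layer sizes and learning rate are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for an "expensive" surface; in the real setting each
# evaluation might take seconds on a supercomputer.
def expensive(x):
    return np.sin(3 * x) + 0.5 * x**2

# Sample the expensive function once, up front.
X = rng.uniform(-2, 2, (512, 1))
Y = expensive(X)

# One-hidden-layer network trained with plain full-batch backprop.
W1 = rng.normal(0, 1, (1, 64)); b1 = np.zeros(64)
W2 = rng.normal(0, 0.1, (64, 1)); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2, h

lr = 0.05
for _ in range(8000):
    pred, h = forward(X)
    err = pred - Y
    # Output-layer gradients, then backprop through the tanh layer.
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h**2)
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

# The trained net is now a fast surrogate: two small matrix multiplies
# instead of re-running the expensive calculation millions of times.
grid = np.linspace(-2, 2, 200).reshape(-1, 1)
mse = float(np.mean((forward(grid)[0] - expensive(grid)) ** 2))
print(mse)
```

The payoff is exactly the one in the paper Corey mentions: after the one-time training cost, each of the millions of evaluations inside the simulation loop becomes cheap.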
It doesn't require GPUs, and it doesn't do a lot of matrix multiplication. Instead it does a lot of hash-table lookups. I can imagine that, with the innovation in hardware and in new network architectures, in five years' time we'll be looking back...

...at some of these models and go, remember when we used to use ImageNet or ConvNets? How primitive those were compared to what we're doing today. So yeah, I think there are definitely going to be new use cases that come out. That's about enough. I could probably talk about this for a while, but without going down too many tangents: are there any questions you wish I would have asked that I didn't? No, I think you guys did great. I'm really excited about the AI space. I think it is a transformational technology. Andrew Ng said it's like electricity, and I believe that. It's going to be in everything we do in one shape or form. It's going to be very powerful, which means it can be used for good and bad, so we're going to have to keep tabs on it, just like you do with any technology, and I think it's going to help us solve some challenging problems. Look at the pace at which AI can learn versus the pace at which we've learned. I actually wrote a blog post on this: evolution only lets you learn through lineage, and then we started writing, so you could share things between people and learn and have memories of, oh, don't stick your hand in a lion's mouth because it'll bite it. That becomes something you've learned without trial and error. But with deep learning, we can continuously learn, and that in and of itself is amazing: what we'll be able to do by having something just continuously learning, observing, recognizing new patterns. So I'm really excited about AI, which is why I work in this field, and I'm excited about what role Core plays in powering data scientists to solve these problems. So yeah, I think you guys did a good job.
Is there anything else you'd want to ask, or want me to dig in more for your audience, Dean? I mean, there are so many stupid questions I could ask about AI, because I'm genuinely interested in the field but don't know much about it. I'll act as the stupid audience member and ask those questions. Okay, so let's start with the recent one: GPT-3. What's the big deal? So when you get to these language models, it's really about what's called word embeddings: trying to learn the relationships between things in an almost multidimensional graph. Typically, whether it's BERT, GPT-2, GPT-3 or ELMo, the difference is just how many parameters you've trained. If you look at the brain, I think it's a hundred billion synapses, so think of those as parameters. GPT-3 is just a very, very large model, and the conjecture is that the more parameters, the more complex the model, the more advanced it can be as a language model. So that's basically the thesis around these natural language processing models. So bigger is better? Is GPT-3 the most advanced one right now? Actually, I'm not sure, because there's one that Microsoft recently did, T-something, I can't remember the acronym, a transformer-based model, and I'm not sure if GPT-3 is actually larger than that. You could just look up the number of parameters. Like I said, the reason this is limited to these big companies is really: who has access to that much computational power to be able to train these models? But yeah, it's a toss-up between GPT-3 and the Microsoft one, I can't remember what it's called. I'm kind of surprised that a model like GPT-3 came out of OpenAI, which is the Elon Musk company, I think, right? Correct.
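The word-embedding geometry Ian mentions, relationships between words as directions in a high-dimensional space, can be shown with toy vectors. These four-dimensional "embeddings" are invented for illustration; real models learn hundreds or thousands of dimensions from huge corpora.

```python
import numpy as np

# Made-up 4-dimensional embeddings, hand-crafted so related words point
# in similar directions. No trained model produced these numbers.
emb = {
    "king":  np.array([0.9, 0.8, 0.1, 0.2]),
    "queen": np.array([0.9, 0.1, 0.8, 0.2]),
    "man":   np.array([0.1, 0.9, 0.1, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9, 0.1]),
    "boat":  np.array([0.1, 0.1, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the same direction in the space."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The classic analogy: king - man + woman should land near queen,
# because the "royalty" and "gender" directions add up.
target = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)
```

The parameter counts Ian is comparing across models are, loosely, how many such learned numbers the model has available to encode relationships like these.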
Yes. I'm kind of surprised by stuff like that. Is Google doing open-source stuff like this? Because you'd think they have both the computational power and the data sets required to get something equally good, if not better, off the ground. Yeah, they...

...they did. I think GPT-2 was done by Google, if I'm not mistaken. The language models all have funny names: ELMo, BERT, GPT-2, GPT-3, and then the Microsoft one, which for the life of me I can't remember. They do come out of either academia or nonprofits. OpenAI, I believe, trained their model on Microsoft's Azure infrastructure, because Microsoft invested a good portion of infrastructure into OpenAI. But yeah, these are great for natural language processing, and it's funny, we've kind of stopped innovating much on computer vision. I think people feel that's at a reasonable place, and now everybody is innovating on natural language processing models. I think what will happen is we'll learn about new approaches and then circle back and apply them to computer vision again. So yeah, we're going to continue with new architectures and with finding ways to shrink them down to run on edge devices. To that end, it seems really important to have people like you at infrastructure companies figuring out what compute devices and architectures are available to train these various things and where they're most efficient. Like you said, people can't do these things unless they have access to a certain amount of resources, or unless those resources are even made available and broadcast to the people who would like to use them, and without someone like you in your position it's really hard to bridge that gap. Are you seeing your competitors, other people who provide these resources, hire people who are interested in machine learning, researchers? Is it just part of the deal, like you don't run a company like that unless you have someone like you involved? Well, to the first part of your statement, absolutely true.
The example that comes to mind: we did an initiative with MIT where one of their researchers trained what's called BigGAN, a generative adversarial network, on our infrastructure, and that was the first time somebody trained that network outside of Google. So absolutely, our goal is to help bring this capability to all data scientists, and we want it to be democratized, to make sure everybody can have access to this infrastructure in one shape or form. It is actually much harder to manage than traditional Dell servers, for example, so there's an expertise around managing these machines, sharing them, making use of them. The worst cardinal sin you could commit is to buy this expensive infrastructure, these servers, and then not be able to utilize them, to have idle GPUs or fifteen percent utilization, especially since you paid such a hefty price for them. So we make sure that people can utilize their infrastructure in the best possible way. Makes total sense. That sends me back, almost into PTSD, to handling job queues on the supercomputer for scientific applications. There's so much power not being used right now! What's going on? Right, exactly. It's brutal. This researcher made a comment about the challenge, of course: unless you have a fully accelerated fabric, everything from storage to networking, whether you use NVLink and NVSwitch or PCI Express, that whole pipeline of getting data onto the silicon on the GPU has to be accelerated, or you're going to end up with underutilized infrastructure. And if you have a ten percent underutilized GPU, that adds up a lot.
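The "adds up a lot" arithmetic is easy to make concrete. The rates below are hypothetical placeholders for illustration, not Core Scientific's or any cloud provider's actual pricing.

```python
# Back-of-the-envelope cost of underutilization.
def wasted_spend(hourly_rate, gpus, utilization, hours):
    """Dollars paid for GPU-hours that did no useful work."""
    return hourly_rate * gpus * hours * (1.0 - utilization)

# A modest 8-GPU cluster at a made-up $2 per GPU-hour, running for a
# year at the 15% utilization figure mentioned above:
waste = wasted_spend(2.0, 8, 0.15, 24 * 365)
print(round(waste))
```

Even at these small, invented numbers the idle cycles run to six figures a year, which is the economic case for right-sizing workloads before they run.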
And so with our software, called PLEXUS, when you run a workload, it will make a recommendation to make sure you're not going to run on GPUs that are going to be half utilized and that you'd pay an arm and a leg for on a public cloud provider, for example. It will right-size the infrastructure for you to make sure you're not overpaying. Or if you run a job, it'll tell you, listen, you have ample CPU cycles to spare, maybe consider moving some of the compute onto the CPU. So, to the meta point of how you maximize your investment and...

...make sure you don't have idle cycles. Because, as amazing as they are, Nvidia releases a new GPU architecture every two years, and so you need to make sure you get the most out of them. That's something the mining folks know a lot about. When I started with Core, S9s were all the rage, and now S9s are struggling, so you have to maximize the value you get out of the infrastructure in the two-year window in which it's considered state of the art. That's another place we see an overlap between blockchain and AI. That's really interesting. That could probably be a whole other episode in itself: discussing the linkage between hardware architectures and the software that runs on them, and how you figure out how to maximize that. Yeah, because it's not like your PC. I mean, my PC, well, I just got a new one, but the one I had before that was eight years old. An eight-year-old GPU is an antique, and for old mining hardware the thesis is almost the same, right? It's a make-the-most-while-you-have-it kind of game. All right. Dean, you got anything else here? No? All right. I really appreciate you coming on to chat with us about this stuff. It's quite fascinating to hear this side of the conversation, which you don't hear much of. So thanks for coming on. Maybe we can have you back to dive into more stuff. Sounds great. Thanks, everybody.
