Rivian Eventual L3 and Data Gathering

ajdelange · Feb 9, 2020

EVian said:
To have one amorphous system that perceives the world and directly controls steering and brakes would be insanity, but I don’t think that’s what you’re suggesting. That would not pass a Functional Safety Assessment.

It's not that I am suggesting it. The Tesla cars are exactly that and they are considered the safest cars on the road. I'm not sure what is meant by "amorphous" here. They are hardly that, but there are loops within loops within loops if you get down to the motor/inverter controls.

EVian said:
Do we agree then that the perception/classification and control aspects are separate parts or functions within the system?

Of course, but they can't be treated independently (you may have guessed I was a systems engineer). The job of a control system is to estimate the state of the system, compute the cost of being in other than a desired state and come up with a set of inputs to the system to move it towards the region in phase space where the costs are low (minimum). Clearly there are separate parts to this and some decoupling is possible. The Tesla FSD engine contains some Wiener filters (deconvolutions) and these need to be trained adaptively to minimize the error between estimated state and actual state. The optimum filters depend on the sensors and environment, not on how the cost functions are applied to the estimates made by these filters, so the adaptation algorithms for them can be developed independently of the servo algorithms to some extent. But that dependence on the sensor suite cannot be neglected in terms of OP's question. If you are starting with a clean sheet of paper you might well start with algorithms that take sensor geometry, precision and noise model as parameters because you haven't any idea at this point what your sensor suite is going to be. Such algorithms ought to be able to work with any sensor suite and so data from Tesla could be used to check that your algorithms are basically working. Thus I agree that one might use data from someone else to evolve a classifier/estimator. But one must proceed with caution. For example, the required word size depends on the condition number of the covariance matrices being inverted (Wiener filter) and if your sensor suite turns out to deliver covariance matrices that are more nearly singular than Tesla's you may be in for a big surprise (this happened on a program I worked on - it was not a happy time). Practically speaking one really must train with data derived from one's own system or one known to be very like it.
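To make the word-size point concrete, here is a minimal sketch in Python with NumPy. The equicorrelated covariance model and the names covariance_with_correlation and inversion_error are my own illustrative assumptions, not anything from Tesla or Rivian. It inverts a covariance matrix in 32-bit and 64-bit floating point and reports how far the product with its inverse is from the identity.

```python
import numpy as np

def covariance_with_correlation(rho: float, n: int = 4) -> np.ndarray:
    """Covariance of n unit-variance sensor channels with pairwise correlation rho.

    Hypothetical model: the closer rho is to 1, the more nearly singular
    the matrix and the larger its condition number.
    """
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))

def inversion_error(cov: np.ndarray, dtype) -> float:
    """Largest deviation of cov @ inv(cov) from the identity when the
    inverse is computed with the given floating-point word size."""
    c = cov.astype(dtype)
    inv = np.linalg.inv(c)
    # Check the result in float64 so only the inversion precision is measured.
    residual = c.astype(np.float64) @ inv.astype(np.float64) - np.eye(cov.shape[0])
    return float(np.max(np.abs(residual)))

for rho in (0.5, 0.999999):
    cov = covariance_with_correlation(rho)
    print(f"rho={rho}: condition number={np.linalg.cond(cov):.3g}, "
          f"float32 error={inversion_error(cov, np.float32):.2e}, "
          f"float64 error={inversion_error(cov, np.float64):.2e}")
```

With weakly correlated channels either word size should do. With nearly redundant channels the short word should lose most of its significant digits, which is the kind of surprise a different sensor suite can deliver.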

EVian said:
So for the perception part it can learn from data, either real or simulated, that is played to it, rather than being sensed in the real world. The perception part only needs to work out what it perceives and then compare that with the reference answer for the data played to it.

As noted above the data from the real world depends very much on the sensor suite. The data from 3 surgeons in the road is very different when it comes from 2 cameras, 2 sonic sensors and 2 Lidars than when it comes from 4 cameras, 3 sonic sensors and a radar.

EVian said:
Feedback for the control part is whether the vehicle is where it is supposed to be and that its movement is within certain parameters; rate of steering, g-forces etc.

Feedback for the control part is the difference between the estimated state vector and the minimum cost state vector at the autopilot subsystem level. IOW d and q aren't elements of the autopilot state vector but velocity and position would be.

EVian said:
‘Where the vehicle is supposed to be’ is as determined by the perception/classification part.

Not sure whether "supposed to be" means where it ought to be or where the classification system supposes it to actually be. The latter is correct if we understand that we are talking about phase space - not Euclidean space.

EVian said:
If the vehicle is doing something it shouldn’t as determined by the user and that is because the perception is wrong then there is no feedback from the system itself.

The feedback is still there. It is still the difference between a low cost state vector and the estimate of the current state vector (which in this case does not represent the true state). If the system perceives that there is a surgeon in the road when there isn't, it thinks that the car is in a high cost state and slams on the brakes to get out of it. That's not good of course but the control part of the system acted as it should given the information it got from the estimator. This happens fairly frequently with the current Tesla autopilot, BTW.
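A toy sketch of that point, in Python. Everything in it is hypothetical: the two-element state, the gains, the braking limits and the names StateEstimate, lowest_cost_speed and accel_command are made up for illustration and have nothing to do with any real autopilot. The controller sees only the estimate, so a phantom obstacle produces the same hard braking as a real one.

```python
from dataclasses import dataclass
import math

@dataclass
class StateEstimate:
    speed_mps: float       # estimated vehicle speed
    obstacle_gap_m: float  # estimated distance to nearest obstacle (inf if none)

def lowest_cost_speed(est: StateEstimate, cruise_mps: float = 30.0,
                      decel_mps2: float = 6.0, margin: float = 1.5) -> float:
    """Speed of the low-cost state: stopped if the estimated obstacle is
    inside the (padded) stopping distance, otherwise the cruise speed."""
    needed_m = margin * est.speed_mps ** 2 / (2.0 * decel_mps2)
    return 0.0 if est.obstacle_gap_m <= needed_m else cruise_mps

def accel_command(est: StateEstimate, gain: float = 0.5,
                  max_brake: float = 6.0, max_accel: float = 2.0) -> float:
    """Feedback is the difference between the low-cost state and the
    ESTIMATED state; the true state never enters the calculation."""
    error = lowest_cost_speed(est) - est.speed_mps
    return max(-max_brake, min(max_accel, gain * error))

clear_road = StateEstimate(speed_mps=30.0, obstacle_gap_m=math.inf)
false_alarm = StateEstimate(speed_mps=30.0, obstacle_gap_m=40.0)  # nothing is really there

print("correct estimate:", accel_command(clear_road))
print("false alarm:     ", accel_command(false_alarm))
```

The control law is behaving correctly in both cases. Only the estimator is wrong in the second one.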

EVian said:
Feedback comes from the user (they take control), and that’s something the fleet will ultimately learn from, but not the individual vehicle right there and then (cf. the Tesla Autonomy Day video).

I mentioned above loops within loops. The outermost loop is the driver. If he fears that he's going to get rear-ended because the car has slammed on the brakes in response to a false alarm he'll stomp the accelerator to override the autopilot. Now there is a loop outside the driver loop too. If the car senses that autopilot was overridden during sudden braking it may well decide to forward the current state estimate and sensor data to the mother ship for analysis. I don't think this data would be of much use to Rivian except in the broadest sense. Now I'll note that this is a little confusing as "user feedback" is not the thing that a controls engineer thinks of when he sees "feedback".

EVian said:
So for the perception part it can learn from data, either real or simulated, that is played to it, rather than being sensed in the real world.

Well no, not unless the data comes from a sensor suite that is identical to that of the estimator being developed as discussed above or can be transformed to have essentially the same characteristics (e.g. a facial recognition system using low resolution cameras could be trained from a library of high resolution images.)
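As a sketch of the kind of transformation meant by the camera example, here is a block-averaging downsampler in plain Python. The function name block_average and the tiny test image are hypothetical. A real pipeline would also have to match the target camera's noise, optics and dynamic range, which this does not attempt.

```python
def block_average(image, factor):
    """Reduce resolution by averaging each factor x factor block of pixels.

    `image` is a list of equal-length rows of brightness numbers whose
    dimensions are exact multiples of `factor`.
    """
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be multiples of factor")
    low_res = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        low_res.append(out_row)
    return low_res

high_res = [[0, 2, 4, 6],
            [2, 4, 6, 8],
            [1, 1, 3, 3],
            [1, 1, 3, 3]]
print(block_average(high_res, 2))
```

Going the other way (inventing detail the training data never contained) is not possible, which is the point of the dog example that follows.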

EVian said:
The perception part only needs to work out what it perceives and then compare that with the reference answer for the data played to it.

But it can't, for example, be trained to recognize dog breeds based on weight from a library of photographs of dogs.

In an attempt to wrap it up: Your ideas can generally be responded to with phrases like "well maybe under some conditions", "well OK if", "possibly if you first" and so on but clearly the simplest and most robust path for Rivian is to collect data from their own sensor suite and use it to develop their state estimators and to collect vehicle dynamics data from their development models and merge these into one integrated system that perceives the world and directly controls steering and brakes and motors. Clearly they will have had plenty of time to do this by the time they start shipping and are probably out there collecting data as I write this.

EVian · Feb 9, 2020

ajdelange said:
In an attempt to wrap it up: Your ideas can generally be responded to with phrases like "well maybe under some conditions", "well OK if", "possibly if you first" and so on but clearly the simplest and most robust path for Rivian is to collect data from their own sensor suite and use it to develop their state estimators and to collect vehicle dynamics data from their development models and merge these into one integrated system that perceives the world and directly controls steering and brakes and motors. Clearly they will have had plenty of time to do this by the time they start shipping and are probably out there collecting data as I write this.

Simplest and most robust, yes. But that wasn’t the question. The question was is it possible to learn more quickly than having a fleet driving a billion miles, and the caveated answer is yes, not the “no” you first gave. Rivian will be doing as you say, but they can also be taking advantage of other data. Full stop.

thrill · Feb 9, 2020

Based on the conversations here, I thought the following post I got today might be of interest. To simplify, more data is better if your model needs more data - but there are other approaches to building a model. https://olivercameron.substack.com/p/the-more-data-the-better-right

ajdelange · Feb 10, 2020

Yes, interesting link. And the Tesla seminar link is fascinating too. The former reinforces what I said about the steepness of the learning curve in an earlier post. I don't want to get too technical here but the accuracy of an estimate generally depends on the square root of the number of measurements used to construct it. Thus if we assume Tesla has billions of measurements and Rivian millions then Tesla is clearly at an advantage but the advantage isn't by a factor of 1000, it is only by a factor of 31.6 (the square root of 1000). If Tesla's accuracy is 6 nines (right 99.9999% of the time - which it isn't) it is wrong 0.0001% of the time. Rivian, with 1/1000th of the data, and ceteris paribus, would then be wrong 0.0032% of the time and its accuracy 4 and a half nines or 99.997% (any statisticians, mathematicians or information theorists reading this: please forgive me).
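For anyone who wants to check the arithmetic, a few lines of Python will do it. The only assumptions are the ones already stated: error falls as one over the square root of the number of measurements, Tesla has 10^9 of them and Rivian 10^6, and six nines is a made-up starting point. The function name scaled_error_rate is just for illustration.

```python
import math

def scaled_error_rate(reference_error, reference_samples, samples):
    """Error rate if error scales as 1 / sqrt(number of measurements)."""
    return reference_error * math.sqrt(reference_samples / samples)

tesla_error = 1e-6  # six nines: wrong one time in a million (illustrative only)
rivian_error = scaled_error_rate(tesla_error, reference_samples=1e9, samples=1e6)
nines = -math.log10(rivian_error)

print(f"advantage factor: {math.sqrt(1e9 / 1e6):.1f}")
print(f"Rivian error rate: {100 * rivian_error:.4f}%")
print(f"Rivian accuracy: {100 * (1 - rivian_error):.4f}% ({nines:.1f} nines)")
```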

There are other reasons that the addition of tons of additional data yields diminishing returns but I can't think of how to describe them in other than very technical terms beyond that more "corner cases" get introduced and that complicates the decision making. But it is the ability to handle those corner cases that gets us the additional nines and that is what we are really after. I expect there is lots of discussion as to just how many nines will be required for level 5 on public streets.

Thus the answer to the question as to whether Tesla is at an advantage or not is "yes" but to the broader question of whether it is so far ahead of Rivian that one ought to wait for the CT is "no". Or to the question "Can Rivian catch up" the answer is "no, but it doesn't matter". At least not for me. Four and a half nines is (not that this is the real number) plenty for me. You might interpret it as meaning intervention is required 3 times in 100,000 miles driven on autopilot.

ajdelange · Feb 10, 2020

I don't even know if I should post this but I fancied that I might be able to explain why adding data doesn't always improve performance and why you can't use data from one type of sensor to train a system which uses another in mathematical terms without getting into the hairy details. The basic mathematics has very wide application. I have used it professionally in communication systems, antenna pointing and characterization and to discipline atomic clocks. In the hobby world I have used it to characterize beer color (though that work resulted in a paper in a peer reviewed journal and a chapter in a textbook I didn't get paid for either).

Consider dogs. Let's say we have 1000 pictures of dogs taken with a 1 Mpixel black and white camera. Each picture can be represented as 1 million brightness numbers. Starting at the upper left hand corner arrange them in a column. That column of numbers represents this camera's view of a particular dog. Now do the same with the numbers from the picture of the second dog and the third until you have done all 1000 pictures. You now have a huge (1000000 x 1000) array of numbers. This matrix contains information about various dogs and how they differ from one another and it also contains information about the background(s) against which the dogs were photographed and it also contains information about the flaws in the camera e.g. its thermal noise, quantizing noise, bokeh, halo, aberration, resolution... Thus that matrix represents this group of 1000 dogs as seen by this camera.

Suppose instead of photographs we had sets of numbers representing height at shoulder, height at withers, tail length, overall tail length etc. (sets of doggy Bertillon measurements). The data now represents variation in dogs as represented by Bertillon measurements. And if the data were sets of numbers from a LIDAR the matrix would represent this ensemble of dogs as represented by a LIDAR scan.

So far this should be simple enough. If we are now given another photograph of a dog (or another set of Bertillon measurements or another LIDAR scan) we ought to be able to determine some things about the dog in the photo or scan by comparing the numbers from the new sample to those in the matrix. It should already be pretty clear at this point that we can't take a new list of Bertillon measurements and compare it to the photo or LIDAR matrices but we are really more interested at this point in why additional data is only marginally helpful and the reason for that lies in how we do the comparison. We first do some magic called Singular Value Decomposition on the matrix.

This results in a million new columns of numbers each containing a million points (though with 1000 pictures only the first 1000 of them carry any information from the data). These columns are called "vectors" (a general term for a list of numbers). In particular they are called "eigenvectors" because they represent the very essence of dogginess. Each represents a feature of dogs and the magic relates to the fact that they are given to us in order of their power to represent a dog or, in particular, a dog as seen by this camera. If you were to rearrange the numbers back into a raster and look at the raster you would see things that look like dogs and by combining these in various proportions you can create a picture of a dog.

If you take a picture of a Leonberger from the original set (which is, clearly, the training set in the broader context) and multiply/accumulate its points with those of each of the eigenvectors (called "dotting") you would get a set of numbers (coefficients) which are characteristic of this training set picture of a Leonberger. If you did the same for a Chihuahua you would get a set of numbers which would be quite different. The first coefficients (those from the low numbered eigenvectors which represent the features that most distinguish dogs) are large so consider plotting the first two coefficients on a graph. The point represented by these for a Leonberger will be quite distant from the point for the first two coefficients for a Chihuahua so it is easy to make a decision as to whether the test picture represents a Leonberger, a Chihuahua or something else.

To do this we determine the coefficients for all the Leonberger pictures in the training set and plot them. They should cluster. We then do the same for all the Chihuahua pictures in the training set. They should cluster at some distance from the Leonberger cluster. We then draw a circle (or ellipse) around each cluster. To tell what breed a dog in a test picture is we dot the pixel numbers with the first two eigenvectors and plot the resulting point. If it lies in the Leonberger circle we say the test dog is a Leonberger. If it lies within the Chihuahua circle we declare him a Chihuahua. If the point falls within neither circle or ellipse we say he is neither. As part of the training involves locating the region for each of the breeds of interest, or at least all those in the training set, there will be lots of other ellipses on our plot.
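For those who like to see it run, here is a minimal sketch of the procedure just described, in Python with NumPy. The "pictures" are synthetic: each breed is a made-up random template of 400 "pixels" plus camera noise, and the names templates, take_pictures and classify and the circle radius rule are all my own assumptions for illustration. It is not a real image classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_per_breed = 400, 50

# Hypothetical stand-ins for what each breed looks like to this camera.
templates = {"Leonberger": rng.normal(size=n_pixels),
             "Chihuahua": rng.normal(size=n_pixels)}

def take_pictures(breed, count, noise=0.3):
    """Return a (pixels x count) matrix; each column is one noisy picture."""
    return templates[breed][:, None] + noise * rng.normal(size=(n_pixels, count))

labels = np.array([b for b in templates for _ in range(n_per_breed)])
training = np.hstack([take_pictures(b, n_per_breed) for b in templates])

# Singular Value Decomposition; the columns of U are the "eigenvectors",
# delivered in order of decreasing singular value.
U, s, _ = np.linalg.svd(training, full_matrices=False)
basis = U[:, :2]  # keep only the first two

def coefficients(picture):
    """Dot the picture with each of the first two eigenvectors."""
    return basis.T @ picture

# Locate each breed's cluster and draw a circle around it.
clusters = {}
all_coeffs = basis.T @ training
for breed in templates:
    pts = all_coeffs[:, labels == breed]
    centre = pts.mean(axis=1)
    radius = 2.0 * np.max(np.linalg.norm(pts - centre[:, None], axis=0))
    clusters[breed] = (centre, radius)

def classify(picture):
    """Return the breed whose circle contains the point, else 'neither'."""
    point = coefficients(picture)
    for breed, (centre, radius) in clusters.items():
        if np.linalg.norm(point - centre) <= radius:
            return breed
    return "neither"

print(classify(take_pictures("Leonberger", 1)[:, 0]))
print(classify(take_pictures("Chihuahua", 1)[:, 0]))
print(classify(np.zeros(n_pixels)))  # a blank frame is no dog at all
```

Note that the basis and the circles both come from this particular "camera". Feed classify a vector of Bertillon measurements and the dot products are meaningless.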

The reason more and more data doesn't necessarily improve performance is that as we introduce more and more data we introduce more and more cases that aren't well represented by the original training set. Consider a photo of an Estrela. A Leonberger owner shown one would doubtless call it a Leo. IOW it looks a lot like one. If you mollify offended members of the Estrela owners club by adding photos of Estrelas to the training set you, of course, get a different set of eigenvalues but the first two coefficients from Estrelas are going to be very close to those for Leonbergers. To distinguish them you are going to have to consider minor features and that means going way down the eigenvector list. The decision space is now multidimensional. If the feature(s) that separate(s) Leonbergers from Estrelas is subtle enough it will be of magnitude comparable to the features of the noises and distortions. In this case you would perhaps want to consider a sensor which separates Estrelas better such as a color camera. This, of course, much further complicates everything but here we only observe that this means a different set of eigenvalues and a different set of coefficients.

Each eigenvector has a certain amount of power to distinguish. That power is called its eigenvalue. Just adding more data to the training set has the potential to redistribute the eigenvalues in such a way that more coefficients may be required to do the job as well as it could previously be done with fewer. But we may have to do this to accommodate edge cases.
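A small synthetic experiment along these lines, again in Python with NumPy and again entirely made up: the "Estrela" template is simply the "Leonberger" template plus a small random difference, the noise level is arbitrary, and separation is a hypothetical figure of merit (distance between cluster centres divided by the RMS scatter within the clusters). The singular values printed are the square roots of the eigenvalues of the training matrix times its transpose.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pixels, n_each, noise = 400, 50, 0.3

leo = rng.normal(size=n_pixels)
chi = rng.normal(size=n_pixels)
estrela = leo + 0.15 * rng.normal(size=n_pixels)  # subtly different from a Leo

def pictures(template):
    """(pixels x n_each) matrix of noisy pictures of one breed."""
    return template[:, None] + noise * rng.normal(size=(n_pixels, n_each))

sets = {"Leonberger": pictures(leo),
        "Chihuahua": pictures(chi),
        "Estrela": pictures(estrela)}
training = np.hstack(list(sets.values()))
U, s, _ = np.linalg.svd(training, full_matrices=False)

def separation(breed_a, breed_b, k):
    """Centre-to-centre distance over RMS within-cluster scatter,
    using only the first k coefficients."""
    a = U[:, :k].T @ sets[breed_a]
    b = U[:, :k].T @ sets[breed_b]
    gap = np.linalg.norm(a.mean(axis=1) - b.mean(axis=1))
    scatter = np.sqrt(np.mean(np.concatenate([
        np.sum((a - a.mean(axis=1, keepdims=True)) ** 2, axis=0),
        np.sum((b - b.mean(axis=1, keepdims=True)) ** 2, axis=0)])))
    return gap / scatter

print("leading singular values:", np.round(s[:5], 1))
for k in (2, 3):
    print(f"k={k}: Leo vs Chihuahua {separation('Leonberger', 'Chihuahua', k):.1f}, "
          f"Leo vs Estrela {separation('Leonberger', 'Estrela', k):.1f}")
```

In this toy setup two coefficients should separate Leos from Chihuahuas comfortably but leave Leos and Estrelas overlapping, and the third coefficient is the one that pulls them apart. Its singular value is much closer to the noise floor than the first two, which is the situation described above.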

FuriousFun · Feb 12, 2020

Do we know if the self driving computer and or the media control computers can be easily upgraded if better parts become available?

ajdelange · Feb 12, 2020

The computer is a brand new in-house design that runs at TOPS (tera operations per second) rates. Nonetheless a new computer will come along soon. More processing power only gets you so far. Basically the Perceptron is, from what I can see, little changed from what I was taught about in school 50 years ago except that it runs at TOPS instead of MOPS.

The software in Rivian's vehicles will be upgradable over the air. This means that bugs can be squashed and new features added without upgrading hardware. Whether the hardware itself will be upgradable depends on many factors. I have no insight whatsoever.

[Edit] As I have done before I forgot where I was in the first paragraph and described the new Tesla self-driving computer. I have no idea what Rivian is using for its engine. I don't think they have designed their own as Tesla did.

skyote · Feb 13, 2020

FuriousFun said:
Do we know if the self driving computer and or the media control computers can be easily upgraded if better parts become available?

I asked that question about HW upgradeability for autonomy. The answer I received is that the sensors would be the gating factor, lidar in particular. More processing is distributed to the sensors themselves, and those would be harder to swap than a central driving computer.

ajdelange · Feb 13, 2020

The one thing we can be sure of is that the sensors, the computer hardware and the software/firmware will evolve so that next year's truck will be better than last year's. How this evolution takes place remains to be seen and whether retrofit of hardware will be possible or offered also remains to be seen. Often manufacturers do not make it possible to retrofit in order to push consumers into buying the latest model.

electruck · Feb 13, 2020

ajdelange said:
The one thing we can be sure of is that the sensors, the computer hardware and the software/firmware will evolve so that next year's truck will be better than last year's. How this evolution takes place remains to be seen and whether retrofit of hardware will be possible or offered also remains to be seen. Often manufacturers do not make it possible to retrofit in order to push consumers into buying the latest model.

It will be interesting to see if this strategy starts to change. From a consumer perspective, it's not quite such a big deal to spend $500 every year to get the latest smart phone tech (yeah, right, remember back when smart phones were only $500...). When you're talking about $50k-$100k vehicles people expect them to last 10+ years and often aren't willing to spend big bucks just to upgrade. But with battery and AD technology evolving so rapidly, these expensive vehicles are either going to become essentially disposable and lose resale value far too quickly for the comfort of most... or manufacturers are going to have to start designing with future upgrades in mind. Yes, that brings engineering challenges.

Musk keeps talking about the depreciation of ICE vehicles relative to BEV but I'm curious how he rationalizes his claims given that tomorrow's tech always obsoletes today's tech and the tech cycles are getting increasingly shorter relative to an automobile's overall development cycle.

electruck · Feb 14, 2020

According to this guy, Rivian told him at FCL that they will ship with L2 and collect driver data for at least a year before attempting L3.

thrill · Feb 14, 2020

electruck said:
According to this guy, Rivian told him at FCL that they will ship with L2 and collect driver data for at least a year before attempting L3.

His explanation of L2 vs L3 is incorrect when he states that L2 does not allow automated lane changes, etc. L2 requires the driver to pay attention to the driving task because he still needs to determine if he should take over. L3 requires the driver to be available to take over, but he doesn't need to pay attention if everything is going well, and the machine is sufficient to decide if it needs to hand off control. Any capability that requires the driver to pay constant attention is still L2, no matter how fancy it is, such as lane changing, merging, exiting, etc.

Some people call anything above single lane maintenance and adaptive cruise control L2+, but that's not how the SAE defines it.

EVian · Feb 15, 2020

thrill said:
His explanation of L2 vs L3 is incorrect when he states that L2 does not allow automated lane changes, etc. L2 requires the driver to pay attention to the driving task because he still needs to determine if he should take over. L3 requires the driver to be available to take over, but he doesn't need to pay attention if everything is going well, and the machine is sufficient to decide if it needs to hand off control. Any capability that requires the driver to pay constant attention is still L2, no matter how fancy it is, such as lane changing, merging, exiting, etc.

Some people call anything above single lane maintenance and adaptive cruise control L2+, but that's not how the SAE defines it.

Agreed, except that for Level 3, the requirement for the driver to be able to take over essentially means you do need to pay attention. Yes, you shouldn’t need to make the decision to take control, but if the vehicle asks you to take control you will need to do so pretty quickly. If the driver isn’t paying attention, then that handoff isn’t guaranteed to be safe.

EVian · Feb 15, 2020

And actually that reminds me of the _only_ thing I’ve heard RJ say that I didn’t _completely_ buy in to. I think it was the Sean Mitchell interview in the cable car, where RJ said something about L3 (I think) and said ‘you’ll be able to read a book’. For me that’s L4.

The most memorable way of remembering the levels of automation I’ve seen is L3 is hands off, L4 is eyes off, L5 is brain off. Although that doesn’t quite tally with the SAE definition.

And that’s the biggest problem with autonomous vehicles: the public's perception of their capabilities. The guy that got decapitated in a Tesla; not Tesla’s fault. He thought he did not need to pay attention (in a Level 2 vehicle) and paid the price. There is a gap between marketing hype and technical reality that must be addressed if autonomous vehicles are going to be accepted, I believe.

electruck · Feb 15, 2020

EVian said:
And that’s the biggest problem with autonomous vehicles: the public's perception of their capabilities. The guy that got decapitated in a Tesla; not Tesla’s fault. He thought he did not need to pay attention (in a Level 2 vehicle) and paid the price. There is a gap between marketing hype and technical reality that must be addressed if autonomous vehicles are going to be accepted, I believe.

I strongly agree with this statement, as do others:

https://www.cnbc.com/2020/01/24/sen...t-feature-because-it-can-confuse-drivers.html

If we can at least get people to understand and respect the limitations of their L2 and L3 vehicles, we'll all stand a better chance of surviving this period of teenage AI drivers.
