






















A Chinese commercial vehicle speaks two protocols to its monitoring platform, JT 808 and JT 1078. They are easy to misread as two separate standards a buyer picks between. In fact they are two layers of one stack: JT 808 is the base that carries position and alarms; JT 1078 is a video layer built on top of it. The clearest way to understand them is to see how the second was added to the first.
The two appear side by side on every vehicle spec sheet, which invites a buyer to treat them as a matched pair of equals. The relationship underneath is the one between a building’s foundation and a floor raised on it. JT 808 came first and does the groundwork: it carries where a vehicle is, what alarms it raises, and how it talks to the platform at all. JT 1078 came later for a job JT 808 was never meant to do, carrying camera video; rather than start over, its designers built it onto the base that already worked. The base text has run two main generations, JT/T 808-2011 and the current JT/T 808-2019, with the video extension joining in 2016. The pairing has stayed stable since, which is why a 2016-era video spec still bolts onto a 2019 base.
Seeing the two this way clears up the two errors a buyer can make about them. One is to take them as the same thing, so that a device doing position is assumed to do video. The other is to take them as rivals, two standards to choose between. The truth sits between: a base and an extension, made to run as one stack, each carrying the part of the work it was written for. The build order explains every property the stack shows today, from what the base lays down to how the video floor sits on it.

JT 808 is the groundwork of a vehicle’s link to its platform. It defines how a terminal registers with the platform, proves who it is and keeps a connection alive. Those steps have message IDs of their own: registration is 0x0100, authentication 0x0102, the heartbeat 0x0002 and the routine position report 0x0200, which packs alarm flags, status, latitude, longitude, altitude, speed, heading and time into one frame, sent on the cadence the platform sets. Over that connection JT 808 carries the vehicle’s position from the satellite receiver, the alarms the vehicle raises for speeding, fatigue or a fault, and the routine messages that pass between vehicle and centre. This is the core of what early commercial-vehicle monitoring in China needed: knowing where a vehicle was and being warned when something went wrong. For years that was the whole job. A fleet platform tracked its vehicles on a map, took their alarms and sent the occasional instruction back, all over JT 808. The standard settled how a vehicle and a platform find each other, authenticate, then exchange these messages in a form any compliant platform can read. It is a complete protocol for position and alarms on its own, with no mention of video, since video was not part of what the system was asked to do back then. A JT 1078 video command arriving at a terminal today is simply one more message family in the same numbered conversation.
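The "one numbered conversation" idea can be sketched as a lookup that sorts incoming message IDs by layer. This is a minimal illustration, not a full message table: the four base IDs are the ones named above, while the two video IDs are an assumption recalled from JT/T 1078-2016 and should be checked against the standard text. The names `Layer` and `classify` are hypothetical.

```python
from enum import Enum


class Layer(Enum):
    BASE = "JT 808"
    VIDEO = "JT 1078"
    UNKNOWN = "unknown"


# Message IDs named in the text above.
BASE_MESSAGES = {
    0x0100: "terminal registration",
    0x0102: "terminal authentication",
    0x0002: "heartbeat",
    0x0200: "location report",
}

# Assumption: IDs recalled from JT/T 1078-2016; verify before relying on them.
VIDEO_MESSAGES = {
    0x9101: "live audio/video request",
    0x9102: "live audio/video control",
}


def classify(msg_id: int) -> tuple[Layer, str]:
    """Return the layer a message ID belongs to and a short description."""
    if msg_id in BASE_MESSAGES:
        return Layer.BASE, BASE_MESSAGES[msg_id]
    if msg_id in VIDEO_MESSAGES:
        return Layer.VIDEO, VIDEO_MESSAGES[msg_id]
    # Unlisted IDs are reported as-is rather than guessed at.
    return Layer.UNKNOWN, f"0x{msg_id:04X}"
```

Both families pass through the same function because both arrive on the same session; only the ID tells them apart.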
The shape of JT 808 is a set of messages a vehicle and platform pass back and forth. A location report, 0x0200 in the standard’s own numbering, packs alarm flags, status bits, latitude, longitude, altitude, speed, heading and a timestamp into one fixed block. A registration message brings the vehicle onto the platform. An alarm message flags an event the moment it happens. Each has a defined format, so any platform built to JT 808 reads a message from any compliant terminal. This common language is what lets a fleet mix terminals from different makers on one platform. JT 808 has grown through versions of its own over the years, tightening its messages and adding fields as commercial-vehicle monitoring matured. Each version stayed compatible with the ones before, so a platform and a terminal built to the standard could still understand each other. This steadiness is part of why JT 808 could serve as a base. A foundation that shifted under each revision would have been no ground to build a video layer on; one that held its shape gave JT 1078 something firm to rest on.
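A fixed block like the location report is easy to picture as code. The sketch below decodes the fields listed above, under assumptions that should be checked against the standard: big-endian integers, latitude and longitude in millionths of a degree, speed in tenths of a km/h, and a six-byte BCD timestamp (YYMMDDhhmmss), 28 bytes in all. Hemisphere handling, which the standard signals through status bits, is left out. `LocationReport` and `parse_location` are hypothetical names.

```python
import struct
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LocationReport:
    alarm_flags: int
    status: int
    latitude: float      # degrees
    longitude: float     # degrees
    altitude_m: int
    speed_kmh: float
    heading_deg: int
    time: datetime


def _bcd(byte: int) -> int:
    """Decode one packed-BCD byte, e.g. 0x24 -> 24."""
    return (byte >> 4) * 10 + (byte & 0x0F)


def parse_location(body: bytes) -> LocationReport:
    """Decode the assumed 28-byte basic block of a location report body."""
    if len(body) < 28:
        raise ValueError("basic location block needs 28 bytes")
    # Four 32-bit fields, then three 16-bit fields, all big-endian.
    alarm, status, lat, lon, alt, speed, heading = struct.unpack(
        ">IIIIHHH", body[:22]
    )
    yy, mo, dd, hh, mi, ss = (_bcd(b) for b in body[22:28])
    return LocationReport(
        alarm_flags=alarm,
        status=status,
        latitude=lat / 1e6,
        longitude=lon / 1e6,
        altitude_m=alt,
        speed_kmh=speed / 10,
        heading_deg=heading,
        time=datetime(2000 + yy, mo, dd, hh, mi, ss),
    )
```

Because every field sits at a fixed offset, a platform needs no per-vendor logic to read it, which is the point the paragraph above makes about mixing terminals from different makers.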

Camera video came to commercial vehicles as a separate demand, after the positioning base was already in place. Fleets and regulators wanted to see what happened on a vehicle, the road ahead in a dispute, the loading of goods, the driver at the wheel. A position on a map and an alarm code could not show any of that. The need was for live and recorded camera footage, reaching the platform alongside the position data the vehicle already sent.
This new demand could have been met with a fresh standard, a separate protocol for video with its own way of connecting and authenticating. That path would have meant a second independent link from every vehicle, duplicating the registration and the session management JT 808 already did. The designers took the other path. They treated video as an addition to the working base, defining it as an extension that uses what JT 808 had already settled and adds only the parts video needs.
The cases that called for video were the ones a position could not settle. A dispute over a collision needed the road ahead, which a dot on a map could not give. A claim of mishandled goods needed the loading bay. A worry about a driver dozing needed the cab. Each of these needed a view of the vehicle, a record of what happened that an alarm code could never carry. The pressure for video grew as fleets and regulators found the limits of position alone.
The value video added went beyond settling disputes. A fleet that could see its vehicles managed them better, spotting a driver’s bad habit before it caused harm, confirming a delivery without a phone call, watching a loading bay for damage. The cameras turned the platform from a tracker of dots into a view of the work itself. This wider gain is what carried video from a rare extra to a standard part of a commercial vehicle’s fit.
Regulation pushed video the same way fleet need did. Rules for buses, for goods vehicles carrying certain loads, for vehicles of a size or use the authorities watched, came to call for cameras and the reporting of their footage. A vehicle in those classes had to carry video to be road-legal; the JT 1078 layer, once an option a fleet weighed, became a requirement. Need and rule together made video a standard part of the fit.
Video commands ride the same registered session the position reports use.
This is the core of the relationship: JT 1078 reuses the base rather than rebuilding it. A vehicle running video still registers and authenticates through JT 808, still keeps its connection alive the JT 808 way, still sends its position and alarms over the JT 808 layer. JT 1078 does not touch any of that. What it adds is only what video requires: the definition of camera channels, the commands to start and stop a live stream, the means to find and play back a recording, the handling of two-way audio.
The gain from building this way is that video slots into a vehicle’s existing link, with no second link to stand up. A terminal already talking JT 808 to the platform gains video by adding the JT 1078 layer on the same connection, the same registration and session carrying the new traffic. A platform that already manages a fleet’s position and alarms manages their video through the same channel it already holds open. The extension leans on the foundation at every point, which is what makes it an extension; a second free-standing standard would have rebuilt all of it.
What JT 1078 borrows from the base can be named in full. It borrows the registration that brings a vehicle onto the platform. It borrows the authentication that proves the vehicle is what it claims. It borrows the session that holds the connection open, the heartbeat that keeps it alive, the message framework that wraps every exchange. On all of this JT 1078 adds nothing of its own, taking the working machinery of the base as the ground its video stands on.
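The message framework is the most concrete of these borrowed parts. The sketch below shows the wrapping as commonly described for JT 808: a 0x7E flag byte at each end, an XOR checksum over header and body, and byte escaping (0x7E becomes 0x7D 0x02, 0x7D becomes 0x7D 0x01). These details come from general knowledge of the standard rather than from the text above, so treat them as assumptions to verify; the function names are hypothetical and header layout is not covered.

```python
from functools import reduce

FLAG = 0x7E  # frame delimiter
ESC = 0x7D   # escape byte


def checksum(payload: bytes) -> int:
    """XOR of every byte in the header plus body."""
    return reduce(lambda a, b: a ^ b, payload, 0)


def escape(data: bytes) -> bytes:
    out = bytearray()
    for b in data:
        if b == FLAG:
            out += bytes([ESC, 0x02])
        elif b == ESC:
            out += bytes([ESC, 0x01])
        else:
            out.append(b)
    return bytes(out)


def unescape(data: bytes) -> bytes:
    out = bytearray()
    it = iter(data)
    for b in it:
        if b != ESC:
            out.append(b)
            continue
        nxt = next(it, None)
        if nxt == 0x02:
            out.append(FLAG)
        elif nxt == 0x01:
            out.append(ESC)
        else:
            raise ValueError("bad escape sequence")
    return bytes(out)


def wrap(payload: bytes) -> bytes:
    """Frame a header+body payload: flag, escaped payload+checksum, flag."""
    return bytes([FLAG]) + escape(payload + bytes([checksum(payload)])) + bytes([FLAG])


def unwrap(frame: bytes) -> bytes:
    """Strip flags, undo escaping, verify the checksum, return the payload."""
    if len(frame) < 3 or frame[0] != FLAG or frame[-1] != FLAG:
        raise ValueError("missing frame flags")
    raw = unescape(frame[1:-1])
    payload, check = raw[:-1], raw[-1]
    if checksum(payload) != check:
        raise ValueError("checksum mismatch")
    return payload
```

A video command and a position report would both pass through `wrap` unchanged, which is what it means for the extension to add nothing of its own at this level.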
Building in layers this way carries a practical good beyond saving a connection. A maker who already builds JT 808 terminals reaches video by adding the JT 1078 layer, working from a base it knows. A platform vendor extends a tracking platform into video the same way. The industry grew its video capability on top of a positioning capability it already had, the layering letting each side add video without throwing away the positioning work behind it.
The layering is a way of building that runs through all sound engineering, seen far beyond these two standards. A working system that has to grow faces a choice. It can tear down what it has and raise a bigger thing from scratch, paying for the rebuild and risking everything that already worked. Or it can leave the working part in place and add the new capability on top, as a layer that uses what lies below it. The second path is the one JT 1078 took. The positioning base was sound, proven across years of fleets that depended on it; a video standard that started over would have thrown that soundness away for no gain. Building video as a layer kept every bit of the base’s value and added only what was missing. The vehicle keeps its proven way of registering, authenticating and reporting position; the video rides on that, a new floor on a foundation that holds.
This is why a buyer does well to resist seeing the two as rival choices. A rivalry would mean picking one over the other. The two stack, with no choice to make between them. The base does the job it always did, the position and the alarms every commercial vehicle has to report. The extension does the job the base was never built for, the carrying of camera video. A vehicle with cameras runs both because it needs both, the floor and the foundation each doing its own work in one structure.
Read the relationship as a stack and the spec sheet stops being a menu of two options and becomes the description of one system in two layers, the lower carrying position, the upper carrying video, the whole reaching the platform as one coordinated report from one vehicle on one link. The stack is one design split across two standards, the base laid first and the video raised on it, kept apart in name only so that each can be specified and built on its own, the two running as one on the road.
To call them two standards is right on paper and misleading in practice, since on any working vehicle they behave as two layers of one structure, the base running inside every report the vehicle sends. A platform engineer sees the same split in the logs. The 808 messages arrive in a steady stream from every vehicle. The 1078 traffic appears only around the sessions an operator opened.
One detail shows the building-on at work. JT 808 carries the vehicle’s alarms in an alarm-flag field of a fixed size, thirty-two bits, each bit flagging a kind of alarm; that set covered the alarms a position-and-alarm system raised. Video brought alarms of its own, a blocked camera, a recording that failed, a storage fault, kinds the original field had no room left to express.
JT 1078 answered by widening the field to sixty-four bits, the extra room holding the video-related alarms on top of the original set. The lower bits stay as JT 808 defined them, so an older platform still reads the alarms it knows; the upper bits carry the new video alarms a JT 1078 platform understands. The widened field is a small, exact picture of the whole design: the original kept intact, the new built on its edge, the two readable together by anything that speaks the full stack.
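The split is simple enough to show in a few lines. This is a minimal sketch of reading a 64-bit alarm word as two halves, as described above; the specific bit positions in the usage example are hypothetical and stand in for whatever the standards assign.

```python
LOW_MASK = 0xFFFF_FFFF


def split_alarm_word(word64: int) -> tuple[int, int]:
    """Return (base_alarms, video_alarms) from a 64-bit alarm word.

    The lower 32 bits keep their JT 808 meaning; the upper 32 bits
    carry the video alarms added by JT 1078.
    """
    if not 0 <= word64 < 1 << 64:
        raise ValueError("alarm word must fit in 64 bits")
    return word64 & LOW_MASK, word64 >> 32


def set_bits(value: int) -> list[int]:
    """List the positions of the bits that are set, lowest first."""
    return [i for i in range(value.bit_length()) if value >> i & 1]


# Hypothetical example: one base alarm (bit 1) and one video alarm (bit 34).
word = (1 << 1) | (1 << 34)
base, video = split_alarm_word(word)
print(set_bits(base), set_bits(video))
```

An older platform that masks off everything above bit 31 gets exactly the `base` value it always understood, which is why the widening does not break it.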
The video alarms the wider field holds are particular to cameras. A lens blocked or sprayed over loses its view, an event a fleet wants flagged. A camera that drops offline stops recording. A storage card that fills or fails breaks the record. None of these meant anything to a position-and-alarm system, so JT 808 had no bits for them. The widened field gives each its own flag, carried beside the original alarms on the one alarm word a platform reads.
Video forced one thing the base never had to handle, the sheer volume of a media stream. JT 808 messages are small, a position report or an alarm flag, light enough to share one steady link with everything else. A video stream is heavy by comparison, far too large to push through the same channel that carries the control messages without overloading it.
JT 1078 separates the two. The signalling that controls video, the command to open a stream or seek a recording, travels over the established link beside the JT 808 traffic. The video media itself travels over a separate connection, sized for the load, carrying the footage on its own path. One link handles the control, another the media, so a heavy stream never blocks the small messages that keep the vehicle reporting its position and alarms. The split lets the video ride without disturbing the groundwork underneath it.
The separate media path is opened only when video is wanted. While a vehicle is only reporting position over the base link, no video connection need exist. When the platform calls for a stream, or a trigger starts a recording upload, the terminal opens the media channel, sends the footage, then closes the channel when the transfer is done. The control link stays up throughout, carrying the small messages; the heavy media path comes and goes with the video it serves.
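The lifecycle can be modelled without any networking. The sketch below is an in-memory illustration only: it uses no sockets and no real JT 1078 messages, and `TerminalLinks` and its methods are hypothetical names. It shows the control link staying up while the media path opens for a transfer and closes afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class TerminalLinks:
    """Toy model of a terminal's two paths: control (always up) and media (on demand)."""
    control_up: bool = True
    media_open: bool = False
    log: list = field(default_factory=list)

    def report_position(self) -> None:
        if not self.control_up:
            raise RuntimeError("control link is down")
        self.log.append("control: position report")

    def serve_video(self, chunks: list) -> int:
        """Open the media path, 'send' the chunks, close it; return bytes sent."""
        self.log.append("control: stream request received")
        self.media_open = True
        self.log.append("media: open")
        sent = 0
        try:
            for chunk in chunks:
                sent += len(chunk)  # stand-in for writing to a media connection
        finally:
            # The media path closes even if the transfer fails part-way.
            self.media_open = False
            self.log.append("media: closed")
        return sent


links = TerminalLinks()
links.report_position()
links.serve_video([b"frame-1", b"frame-2"])
links.report_position()
print(links.log)
```

The position reports before and after the transfer use the same control link; nothing about the video session disturbs it.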
The video the layer carries comes in two forms. A live stream sends what a camera sees now, for an operator watching a vehicle in real time. A playback pulls a stored recording from the terminal’s card, for reviewing something after the fact. JT 1078 defines both, the commands to open a live view and the commands to find and replay a recording. The one media path serves each in turn, live or recorded, as the platform asks for it.
The two-link arrangement keeps each kind of traffic to a path that suits it. The control messages need a steady, reliable link, since a lost command means a stream that will not start. The media needs room more than steadiness, since a few dropped video frames matter less than a blocked control channel. Splitting them lets each have what it needs, the control its reliability on the base link, the media its room on a path of its own.
On the vehicle the two layers live in one terminal. The device registers with the platform and reports its position and alarms over JT 808, the base layer running continuously as the vehicle moves. When the platform asks for video, or a rule says to record, the same terminal serves the footage over JT 1078, the video layer working through the connection the base layer holds open. The position flows steadily; the video comes and goes as it is called for.
A buyer sees this as one box doing both jobs, since the layers are not separate devices. A vehicle terminal that carries video is a JT 808 terminal with the JT 1078 layer added, the positioning and the video running as parts of one unit on one link to one platform. The terminal is the place the stack comes together, the base and the extension built into a single device the way they are built into a single standard family.
Inside the terminal the two layers share the hardware as well as the link. The same processor that handles the JT 808 reporting handles the JT 1078 video, the same mobile module carries both, the same storage that logs the position data holds the recorded footage. A video terminal is a positioning terminal with the camera handling and the storage to match, built as one device. Nothing about the video stands apart from the box that does the positioning.
A vehicle terminal handles several cameras at once. A typical fit puts a camera on the road ahead, one in the cab, one at a door or in the cargo space, each a channel the JT 1078 layer carries. The terminal tags every channel so the platform knows which view it is watching. The base layer meanwhile reports the one position for the whole vehicle, since location belongs to the vehicle, the several views to its cameras. One terminal carries the single position and the many video channels together.
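As a data structure, this is one position per vehicle and a keyed set of channels beneath it. The sketch below is illustrative: the channel numbers and labels are hypothetical, not the assignments any standard makes, and `Vehicle` and `Channel` are invented names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Channel:
    number: int   # the tag the platform uses to pick a view
    label: str


@dataclass
class Vehicle:
    terminal_id: str
    # One position for the whole vehicle, as (latitude, longitude).
    position: Optional[tuple] = None
    # Many camera channels, keyed by channel number.
    channels: dict = field(default_factory=dict)

    def add_channel(self, number: int, label: str) -> None:
        if number in self.channels:
            raise ValueError(f"channel {number} already defined")
        self.channels[number] = Channel(number, label)

    def view(self, number: int) -> str:
        """Return which view a channel number refers to."""
        return self.channels[number].label


truck = Vehicle("hypothetical-terminal-01")
truck.position = (31.23, 121.47)
truck.add_channel(1, "road ahead")
truck.add_channel(2, "cab")
truck.add_channel(3, "cargo space")
print(truck.view(2))
```

Location is a field of the vehicle while views are entries under it, which mirrors the division of labour between the base layer and the video layer.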
The platform at the other end has to speak both layers too. It takes the JT 808 position and alarms from its vehicles and holds the JT 1078 video channels open for the ones it watches, managing the base and the extension together for the whole fleet. A platform built only for JT 808 tracks vehicles but cannot call up their video; a full platform handles both as one system.
A platform that runs the full stack is built in matching layers. One part takes the JT 808 stream from every vehicle, plotting positions and logging alarms across the fleet. Another part handles the JT 1078 media, opening video channels and serving footage to the operators watching. The two parts work the one connection each vehicle holds, the position always flowing, the video drawn on demand. A platform missing the video part still tracks the fleet, lacking only what the cameras see.
Across a fleet the platform carries the base traffic from every vehicle at once. Thousands of terminals report position continuously, a steady flow of small messages the platform takes in and plots. The video is the exception drawn now and then, a handful of streams open at any moment against the constant base reporting. This is the shape the stack gives a platform: a wide, steady floor of position data, with video rising from it only where and when a vehicle is watched.
What an operator sees on the platform reflects the stack beneath it. A map shows every vehicle from the JT 808 reports, the fleet laid out by position. A click on one vehicle opens its JT 1078 video, the cameras of that vehicle brought up on demand. The operator works the base view across the fleet and the video view one vehicle at a time, the two layers presenting as one screen with position underneath and video a click away.
Above a single platform, the linking of platforms uses a different protocol again, JT 809, which passes data from one platform up to another. An operator’s platform reports to a regulator’s through JT 809, carrying the vehicle data up the chain. JT 809 sits above the vehicle-to-platform stack, never inside it, the piece that joins platforms where JT 808 and JT 1078 join a vehicle to its platform. A buyer meets it mainly when platforms have to share data, less when specifying a terminal.
For a fleet specifying a terminal, the stack resolves to a plain point: a video-capable vehicle device has to carry the whole of it, JT 808 underneath and JT 1078 on top. A terminal that names only JT 808 does position and alarms with no video. A device that claims JT 1078 rests on a JT 808 base, so a vague spec that mentions the video layer without the foundation is a spec to question. The two belong together, the extension meaningless without the base it is built on.
The practical check is to confirm a terminal speaks the full stack and reports to a platform that does the same. A complete vehicle device handles JT 808 for position and alarms and JT 1078 for video, on one link to a platform that manages both. Read a spec that offers video without a solid positioning base, or position with no path to video on a vehicle meant to have cameras, as a sign the stack is incomplete. A fleet that fits the full stack gets a vehicle whose position, alarms and video all reach the platform it has to report to, the foundation and the floor above it standing as one.
The check a buyer makes follows the layers. Confirm the terminal carries JT 808 for the position and alarms every commercial vehicle has to report. Confirm it carries JT 1078 for the video if the vehicle has cameras. Confirm the platform it reports to handles both, since a terminal speaking the full stack to a platform that hears only half wastes what it sends. The full stack on both ends is what a complete fitting needs.
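The buyer's check reduces to a short piece of set logic. The sketch below is one way to write it down, assuming spec sheets have been reduced to sets of protocol names; the strings "JT808" and "JT1078" and the function `check_fit` are hypothetical conventions, not anything a standard defines.

```python
def check_fit(terminal: set, platform: set, has_cameras: bool) -> list:
    """Return a list of problems with a terminal/platform pairing (empty if none)."""
    # Every vehicle needs the base; a vehicle with cameras needs the video layer too.
    required = {"JT808"} | ({"JT1078"} if has_cameras else set())
    problems = []
    for proto in sorted(required):
        if proto not in terminal:
            problems.append(f"terminal lacks {proto}")
        if proto not in platform:
            problems.append(f"platform lacks {proto}")
    # A spec that names the extension without the base is one to question.
    if "JT1078" in terminal and "JT808" not in terminal:
        problems.append("JT1078 claimed without its JT808 base")
    return problems


# A full-stack terminal reporting to a tracking-only platform.
print(check_fit({"JT808", "JT1078"}, {"JT808"}, has_cameras=True))
```

An empty list means the layers line up on both ends; each entry otherwise names the half of the stack that is missing and where.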
JT 808 is the base protocol that carries a commercial vehicle’s position, alarms and basic communication with its platform. JT 1078 is a video extension built on top of JT 808, adding camera channels, live streaming, playback and two-way audio. JT 808 is the foundation; JT 1078 is the video layer raised on it. A video-capable vehicle runs both at once.
No. JT 1078 does not replace JT 808; it builds on it. A vehicle running video still registers, authenticates and reports its position and alarms through JT 808, using JT 1078 only for the video part. The two run as one stack, the extension leaning on the base at every point. Without the base, the extension loses registration, identity and the link it rides; the stack works only as a whole.
Because video brought new kinds of alarm, such as a blocked camera or a storage fault, that the original 32-bit field had no room to hold. JT 1078 widened the field to 64 bits, keeping the lower bits as JT 808 defined them and using the upper bits for video alarms. An older platform still reads the alarms it knows; a full platform reads the new ones too.
Because a video stream is far heavier than a position report or an alarm flag. JT 1078 keeps the video signalling on the established JT 808 link, with the video media itself on a separate connection, sized for the load. This keeps a heavy stream from blocking the small messages that report the vehicle’s position and alarms.
Yes. A JT 808 terminal handles position, alarms and communication with no video, which suits a vehicle that needs tracking without cameras. Adding JT 1078 adds the video layer on top. A video-capable terminal is a JT 808 device with JT 1078 built in, so it always carries the base even when the buyer is focused on the video.
JT 809 links platforms to each other and has its own current text, JT/T 809-2019; the vehicle-to-platform link belongs to JT 808. It passes vehicle data from one platform up to another, such as from an operator’s platform to a regulator’s. JT 808 and JT 1078 run between a vehicle and its platform; JT 809 sits above them, joining platforms up the chain. A terminal buyer meets it mainly when platforms have to exchange data.