Nvidia and major telcos are launching distributed "AI grids" to run inference at the network edge, reducing latency and costs for real-time AI applications.