
Can't load speedspeech onnx file #1263

Open
xd009642 opened this issue Nov 17, 2023 · 6 comments

@xd009642

Uploaded a zip of the model

speedyspeech.zip

Error: Failed analyse for node #1250 "/Pow" Pow

Caused by:
0: Infering facts
1: Applying rule outputs[0].shape == ?
2: Unifying shapes and ?
3: Impossible to unify closed shapes of different rank (found and ?).
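For context, that analyser error comes from shape inference trying to unify two shape facts, which fails outright when the two shapes have different known ("closed") ranks. A toy stdlib-only version of that check (a simplified illustration, not tract's actual analyser code):

```rust
// Toy shape unification: two fully-known ("closed") shapes unify only if
// they have the same rank and agree dimension by dimension.
// Simplified illustration; tract's real analyser also handles unknown dims.
fn unify(a: &[usize], b: &[usize]) -> Result<Vec<usize>, String> {
    if a.len() != b.len() {
        return Err(format!(
            "Impossible to unify closed shapes of different rank (found {:?} and {:?}).",
            a, b
        ));
    }
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            if x == y {
                Ok(x)
            } else {
                Err(format!("dimension mismatch: {x} vs {y}"))
            }
        })
        .collect()
}

fn main() {
    assert_eq!(unify(&[1, 80], &[1, 80]), Ok(vec![1, 80]));
    assert!(unify(&[1, 80], &[80]).is_err());
    println!("ok");
}
```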

The code in question (as an aside, I'm wondering how to tell what the generics should be):

use super::*;
use tract_onnx::prelude::*;
use ndarray::Array2;
use std::path::Path;


pub struct SpeedyTract {
    // model: RunnableModel<F, O, M>
}

impl SpeedyTract {
    #[must_use]
    pub fn load(path: impl AsRef<Path>) -> anyhow::Result<Self> {
        let model = tract_onnx::onnx()
            .model_for_path(path)?
            .into_optimized()?
            .into_runnable()?;

        todo!()
        // Ok(Self { model })
    }
}
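On the aside about the generics: one stdlib trick is to ask the compiler. Either annotate the binding with a deliberately wrong type (`let model: () = ...;`) and read the concrete type out of the E0308 error, or print it at runtime with `std::any::type_name`. A minimal sketch of the latter (`show_type_of` is a made-up helper, and the iterator below just stands in for the tract model):

```rust
use std::any::type_name;

// Print the concrete type of any expression; handy when a builder chain
// returns an unwieldy generic type you need to name in a struct field.
fn show_type_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let v = vec![(1_i64, "a")];
    // Stand-in for a builder chain with an awkward-to-name result type.
    let it = v.iter().map(|(n, _)| n * 2);
    // For the tract model you would call show_type_of(&model) instead.
    println!("{}", show_type_of(&it));
}
```

With the model in hand, `show_type_of(&model)` prints the full `SimplePlan<...>` type, which can then be copied into the struct field.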
@xd009642
Author

Okay, on further inspection this disappears if I remove the into_optimized() call, so I'm removing that to try and continue playing with it.

@xd009642
Author

Latest code that fails with:

Error: Evaluating #1195 "/SequenceEmpty" Unimplemented(SequenceEmpty)

Caused by:
stateless evaluation not implemented

Still having a play around. I will say the level of ONNX support is impressive; all the other Rust solutions I've tried have failed much, much earlier and with less (or no) actionable logs!

use super::*;
use anyhow::Context;
use tract_onnx::prelude::*;
use tract_onnx::tract_hir::infer::InferenceOp;
use ndarray::Array2;
use std::path::Path;


pub struct SpeedyTract {
    model: SimplePlan<InferenceFact, Box<dyn InferenceOp>, Graph<InferenceFact, Box<dyn InferenceOp>>>,
    phoneme_ids: Vec<Unit>,
}


impl SpeedyTract {
    #[must_use]
    pub fn load(path: impl AsRef<Path>) -> anyhow::Result<Self> {
        let model = tract_onnx::onnx()
            .model_for_path(path)
            .context("loading ONNX file")?
        // https://github.com/sonos/tract/issues/1263
        //    .into_optimized()
        //    .context("optimising graph")?
            .into_runnable()
            .context("converting to runnable model")?;

        Ok(Self {
            model,
            phoneme_ids: generate_id_list(),
        })
    }
    
    pub fn infer(&self, units: &[Unit]) -> anyhow::Result<Array2<f32>> {
        let phonemes = units
            .iter()
            .map(|x| best_match_for_unit(x, &self.phoneme_ids))
            .collect::<Vec<_>>(); // This is a Vec<i64>

        let tensor = Tensor::from_shape(&[1, units.len()], &phonemes)?;
        let plen = Tensor::from(units.len() as i64);

        let result = self.model.run(tvec!(tensor.into(), plen.into()))?;

        tracing::info!("Result: {:?}", result);

        todo!()
    }
}

@kali
Collaborator

kali commented Nov 17, 2023

Thanks for the kind words, but Sequences (and Maps) are not supported, and are definitely not on the roadmap. Is there any chance your model could be refactored without sequences?

@xd009642
Author

I don't think so tbh, it's a TTS model and as such works on variable input lengths. I wouldn't mind looking into implementing sequences if a PR would be accepted, but naturally I'm new to the code and internals, so that might not be feasible without at least being pointed in the general direction.

@kali
Collaborator

kali commented Nov 18, 2023

Sequences in tract would be a massively epic overhaul. tract "variables" are Tensors of known fixed rank and "symbolic dimensions". Changing this is huge and would probably have long-term impact on code complexity, maintainability and performance. So don't start hacking on tensor sequences, you would most likely drown in it, or I would probably have to reject the PR. Let's look at other options first.

You may be aware of it, but tract's main application is actually voice processing, and we manage to do everything we need without tensor sequences, including dealing with variable lengths and/or "infinite" inputs. Recurrent networks are the traditional way, but tract's state management, network pulsification and symbolic dimension management gave us the flexibility we needed.

The only kind of generalization I think tensor sequence could bring to the table would be to represent a time-based sequence of tensors having a varying dimension on a non-time axis. This is super exotic. I have never been shown such a design yet.

OK, so what can we do? I had a quick look at the network; most of it looks fine, but then there is a Loop that takes an empty sequence as input, pushes tensors into it, and then the sequence is made into a tensor again. Well, bad news, the Loop is not supported either. tract only has support for Scan (supporting Loop is actually an ongoing, relatively low-priority background task).

Scan does a bit of what the Loop plus Sequence seems to do here: it builds an output tensor by concatenating chunks of data together. The main restriction of Scan compared to Loop is that Scan performs a fixed number of iterations, determined by the time dimension of its input (which can be symbolic; it only has to be determined when the Scan starts). The Loop, on the other hand, can stop the iteration based on a runtime condition computed as part of the loop body itself. That is not something that can be done with Scan.
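The Scan/Loop distinction above can be sketched in plain Rust (no tract involved; `scan` and `loop_` here are toy stand-ins for the ONNX operator semantics, not tract API):

```rust
// `scan` runs its body a fixed number of times, determined by the input's
// length (the "time dimension") before iteration starts. `loop_` may stop
// on a condition computed inside the body itself, which Scan cannot express.

fn scan(input: &[i64], body: impl Fn(i64) -> i64) -> Vec<i64> {
    // Iteration count == input.len(), known up front (possibly symbolic).
    input.iter().map(|&x| body(x)).collect()
}

fn loop_(mut state: i64, body: impl Fn(i64) -> (i64, bool)) -> Vec<i64> {
    // Iteration count decided at runtime by the body's own condition.
    let mut out = Vec::new();
    loop {
        let (next, keep_going) = body(state);
        out.push(next);
        state = next;
        if !keep_going {
            break;
        }
    }
    out
}

fn main() {
    // Scan: exactly three iterations, one per input element.
    assert_eq!(scan(&[1, 2, 3], |x| x * 10), vec![10, 20, 30]);
    // Loop: double until the value exceeds 20; trip count is data-dependent.
    assert_eq!(loop_(1, |s| (s * 2, s * 2 < 20)), vec![2, 4, 8, 16, 32]);
    println!("ok");
}
```

The data-dependent trip count in `loop_` is exactly what makes a Loop's output length unknowable before execution, and hence hard to fit into tract's fixed-rank tensor model.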

How familiar are you with this model design ? Am I making sense here ?

@xd009642
Author

Yeah, that all makes sense, thanks. I'm more aware of the model design from the details in the paper, and not sure how well that maps to the actual implementation. From the phonemes going in, it generates a duration in frames for each phoneme, and then for each phoneme + duration it generates the necessary spectrogram frames.
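To make that two-stage design concrete, a hypothetical stdlib-only sketch of the duration expansion step (`expand_by_duration` is made up for illustration; the real model does this inside the graph, which is presumably why it needs a Loop: the output length depends on the predicted durations):

```rust
// Stage 1 of the model predicts a frame count (duration) per phoneme;
// stage 2 then produces that many spectrogram frames per phoneme.
// Here we just expand phoneme ids by their durations: the total output
// length is only known once the durations have been predicted at runtime.
fn expand_by_duration(phonemes: &[i64], durations: &[usize]) -> Vec<i64> {
    phonemes
        .iter()
        .zip(durations)
        .flat_map(|(&p, &d)| std::iter::repeat(p).take(d))
        .collect()
}

fn main() {
    // Phoneme 7 lasts 2 frames, phoneme 3 lasts 1, phoneme 9 lasts 3.
    assert_eq!(
        expand_by_duration(&[7, 3, 9], &[2, 1, 3]),
        vec![7, 7, 3, 9, 9, 9]
    );
    println!("ok");
}
```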

I was going to look at using torch tracing to generate a model with a longer input context than I need, but I'm a bit concerned that the loop is for the phoneme durations and is therefore dynamic, so it might not work as I hope if I pick my dummy input for tracing wrong.
