Training multiple buildings #9

Open
anasvaf opened this issue Oct 11, 2018 · 14 comments

@anasvaf

anasvaf commented Oct 11, 2018

Hello Odyssea,

I am trying to replicate the results based on the Kelly et al. paper.
Using your code I changed the train building to the following:

train.set_window(start="13-4-2013", end="1-1-2014")
test.set_window(start="1-1-2014", end="30-3-2014")

train_elec = []
test_building = 5
sample_period = 6
meter_key = 'kettle'

# Lists for train_elec, train_meter and train_mains for multiple buildings
train_elec = [train.buildings[i].elec for i in range(1,5)]
train_meter = [train_elec[j].submeters()[meter_key] for j in range(len(train_elec))]
train_mains = [train_elec[k].mains() for k in range(len(train_elec))]

# Test only in one house
test_elec = test.buildings[test_building].elec
test_mains = test_elec.mains()
rnn = RNNLSTMDissaggregate()

I then call train_across_buildings:

start = time.time()
print("========== TRAIN ============")
epochs = 0
for i in range(3):
    print("CHECKPOINT {}".format(epochs))
    rnn.train_across_buildings(train_mains, train_meter, epochs=5, sample_period=sample_period)
    epochs += 5
    rnn.export_model("UKDALE-RNN-h{}-{}-{}epochs.h5".format(train_building,
                                                        meter_key,

I am building lists for the four houses that I need for training, but I get the following error when I call train_across_buildings on the RNN:

[Screenshot of the error message, not preserved]

Could you give me some hint on how to change the code?

Best,
Tasos

@tisalvadores

tisalvadores commented Oct 15, 2018

Hi

I'm having the same error, so I went into the train_across_buildings function and experimented with it.
I found that the error always pointed to the same line (117) no matter what was there: first when I forced self.mmax = x, and then even when I left a blank line there or replaced it with other code.

This led me to think that the pandas error is not related to the function, or at least not to that part of it, but for some reason the traceback always points there. I'm not certain of anything, though :(

Let me know if the same happens to you!

@anasvaf
Author

anasvaf commented Oct 15, 2018

Hello Tomas,

I have also experimented with train_across_buildings. Here is how I tried to "debug" the function:

        if self.mmax is None:
            print(mainchunks)
            for m in mainchunks:
                print(len(m))
                input("wait")

If you look at the Series that are created, you will notice that for house 3, if I am not mistaken, you get an empty dataframe, for which the maximum value cannot be calculated. I believe that is the core of the error.

[Screenshot of the error message, not preserved]

Let me know if you get something similar.

@anasvaf
Author

anasvaf commented Oct 15, 2018

To be more specific, here is what you get when you calculate m.max() inside the for loop:
[Screenshot of the m.max() output, not preserved]
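
For readers following along: one way to guard against the empty-chunk case described above is to skip empty Series before computing the normalization maximum. The sketch below is only an illustration on synthetic data. `safe_global_max` is a hypothetical helper name, not something from the repository.

```python
import pandas as pd


def safe_global_max(chunks):
    """Return the largest value across all non-empty chunks.

    Hypothetical helper: raises a clear error when every chunk is
    empty instead of failing later inside pandas.
    """
    non_empty = [c for c in chunks if len(c) > 0]
    if not non_empty:
        raise ValueError("all chunks are empty; check the selected time window")
    return max(c.max() for c in non_empty)


# Three synthetic "buildings"; the second has no samples in the window.
chunks = [
    pd.Series([120.0, 2300.0, 80.0]),
    pd.Series([], dtype="float64"),
    pd.Series([95.0, 1800.0]),
]

mmax = safe_global_max(chunks)
```

Skipping the empty chunk only hides the symptom. The matching meter chunk for that building would have to be dropped as well before training.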

@tisalvadores

Hmm,
I'm using the REDD dataset, so I haven't checked, but maybe the third house of UK-DALE simply doesn't have a kettle meter. Check it and let me know, because I used to get that error and now I don't anymore, and I don't know why 😅.

Nonetheless, that's not my case (I don't have an empty dataframe) and I'm still having problems.

I changed the code as you suggested to

        if self.mmax is None:
            print(mainchunks)
            for m in mainchunks:
                print(len(m))
                input("wait")
            self.mmax = max([m.max() for m in mainchunks])

and, training with three houses, I got

23179
wait
31616
wait
56886
wait

For some reason my code now runs a bit further: it entered the train_across_buildings_chunk method and died at random.shuffle(batch_indexes).
I fixed that by changing the line above it from batch_indexes = range(min(num_of_batches)) to batch_indexes = list(range(min(num_of_batches))).
But now the reshape X = np.reshape(mainpart, (batch_size, self.window_size, 1)) raises this error:

Traceback (most recent call last):
  File "redd-test.py", line 39, in <module>
    disaggregator.train_across_buildings(train_mains, train_meters, epochs=1, sample_period=sample_period)
  File "/Users/TSV/Desktop/Progra/IPre/Server/seq2seq/2buildings/shortseq2pointdisaggregator.py", line 151, in train_across_buildings
    self.train_across_buildings_chunk(mainchunks, meterchunks, epochs, batch_size)
  File "/Users/TSV/Desktop/Progra/IPre/Server/seq2seq/2buildings/shortseq2pointdisaggregator.py", line 212, in train_across_buildings_chunk
    X = np.reshape(mainpart, (batch_size, self.window_size, 1))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 279, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 51, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
ValueError: cannot reshape array of size 2100 into shape (42,100,1)
Closing remaining open files:/Users/TSV/Desktop/Progra/IPre/data/REDD/redd.h5...done/Users/TSV/Desktop/Progra/IPre/data/REDD/redd.h5...done
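
The ValueError above is a size mismatch: a target shape of (42, 100, 1) needs 42 x 100 = 4200 values, while the slice holds only 2100, which is enough for 21 windows of length 100. A check before reshaping makes that explicit. This is a sketch with a hypothetical helper name, `reshape_windows`, and it does not explain why the slice is half the expected size in this particular run.

```python
import numpy as np


def reshape_windows(values, batch_size, window_size):
    """Reshape a flat array into (batch_size, window_size, 1).

    Hypothetical helper: verifies the element count first so that a
    mismatch is reported with both numbers.
    """
    values = np.asarray(values)
    expected = batch_size * window_size
    if values.size != expected:
        raise ValueError(
            "got {} values, but {} x {} = {} are required".format(
                values.size, batch_size, window_size, expected
            )
        )
    return values.reshape(batch_size, window_size, 1)


# 2100 values fill 21 windows of length 100, not 42.
flat = np.zeros(2100)
ok = reshape_windows(flat, batch_size=21, window_size=100)
```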

Let me know if you get this far, or if you can't get rid of the pandas error.
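
The random.shuffle fix mentioned in this comment follows from a Python 3 change: range() returns an immutable range object, while random.shuffle reorders its argument in place and therefore needs a mutable sequence. A minimal standalone illustration, with made-up batch counts:

```python
import random

num_of_batches = [23, 31, 56]  # made-up per-building batch counts

# Shuffling a range object fails because it does not support
# item assignment.
try:
    random.shuffle(range(min(num_of_batches)))
    shuffle_on_range_failed = False
except TypeError:
    shuffle_on_range_failed = True

# Materializing the range as a list gives shuffle something to reorder.
batch_indexes = list(range(min(num_of_batches)))
random.shuffle(batch_indexes)
```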

@OdysseasKr
Owner

Hello everybody, and sorry for the late reply. Which dataset are you using? For UK-DALE building 3, there is no data in the specified date range:

train.set_window(start="13-4-2013", end="1-1-2014")

This may explain the empty dataframe.
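
The effect can be illustrated with plain pandas, independent of NILMTK: slicing a time-indexed Series with a window that lies outside its recording period yields an empty result. The dates of the synthetic series below are invented for the example and are not the actual recording period of any UK-DALE building.

```python
import pandas as pd

# Synthetic series: one sample per day from 2013-02-27 to 2013-04-08.
index = pd.date_range("2013-02-27", "2013-04-08", freq="D")
mains = pd.Series(range(len(index)), index=index, dtype="float64")

# The window is written with ISO dates to avoid day/month ambiguity.
window_start, window_end = "2013-04-13", "2014-01-01"
in_window = mains.loc[window_start:window_end]

# False here: the window starts after the last recorded sample.
has_data = not in_window.empty
```

Checking each building this way before training would show which ones contribute no samples to the chosen window.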

@tisalvadores

Hi Ody!
First of all thanks a lot for sharing your work and maintaining support! It's been really helpful 😃
I'm using REDD and don't have the empty dataframe problem, but a different error, as I explained above.
Does the train_across_buildings method as it is in the repo work fine for you?

@anasvaf
Author

anasvaf commented Oct 16, 2018

Hello guys,
I managed to fix the error by simply removing the train and test.set_window. I assume that the toolkit can identify the common dates within all the buildings. I am using the UK-DALE dataset for my experiments.

I print the batch 0 array for my 4 training houses and it is the following:
Batch 0 of [1185228, 317437, 70848, 366821]

The error that I get now is that my data has to be 1-D.
[Screenshot of the error message, not preserved]

Any hints for this one?

@anasvaf
Author

anasvaf commented Oct 16, 2018

I managed to fix it by changing the for loop inside the train_across_buildings_chunk as follows:

                # Create a batch out of data from all buildings
                for i in range(num_meters):
                    mainpart = mainchunks[i]
                    meterpart = meterchunks[i]
                    mainpart = mainpart[b*batch_size:(b+1)*batch_size]
                    meterpart = meterpart[b*batch_size:(b+1)*batch_size]
                    X = np.reshape(mainpart.values, (batch_size, 1, 1))
                    Y = np.reshape(meterpart.values, (batch_size, 1))

                    X_batch[i*batch_size:(i+1)*batch_size] = np.array(X)
                    Y_batch[i*batch_size:(i+1)*batch_size] = np.array(Y)

From the pandas Series, only the values are needed for the reshape.

Tomas, if you just change X = np.reshape(mainpart, (batch_size, self.window_size, 1))
to X = np.reshape(mainpart.values, (batch_size, self.window_size, 1)), the code should work.

Now the script is iterating over the batches. I will let you know how it progresses.
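
The loop above can be tried in isolation on synthetic data. The sketch below mirrors its structure with two made-up buildings and a window size of 1. It uses `.iloc` for explicitly positional slicing, which is an addition for the example and not part of the original code.

```python
import numpy as np
import pandas as pd

batch_size = 4
b = 1  # index of the batch being assembled

# Two synthetic buildings with 12 samples each.
mainchunks = [pd.Series(np.arange(12.0)), pd.Series(np.arange(100.0, 112.0))]
meterchunks = [pd.Series(np.arange(12.0) / 10), pd.Series(np.arange(12.0) / 5)]
num_meters = len(mainchunks)

X_batch = np.empty((batch_size * num_meters, 1, 1))
Y_batch = np.empty((batch_size * num_meters, 1))

for i in range(num_meters):
    lo, hi = b * batch_size, (b + 1) * batch_size
    mainpart = mainchunks[i].iloc[lo:hi]
    meterpart = meterchunks[i].iloc[lo:hi]
    # .values returns the underlying NumPy array without the index.
    X = np.reshape(mainpart.values, (batch_size, 1, 1))
    Y = np.reshape(meterpart.values, (batch_size, 1))
    X_batch[i * batch_size:(i + 1) * batch_size] = X
    Y_batch[i * batch_size:(i + 1) * batch_size] = Y
```

Each building contributes `batch_size` consecutive rows, so the combined batch has `batch_size * num_meters` rows.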

@OdysseasKr
Owner

OdysseasKr commented Oct 17, 2018

> I managed to fix the error by simply removing the train and test.set_window. I assume that the toolkit can identify the common dates within all the buildings. I am using the UK-DALE dataset for my experiments.

I am not sure whether the toolkit detects common sections across buildings. Also, by removing the limit, you are now getting all of the data available for each building.
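
If a shared window is wanted, it can be computed by hand rather than relying on the toolkit. The sketch below does so with plain pandas on synthetic recording periods. `common_window` is a hypothetical helper, and it only compares first and last timestamps, so gaps inside a building's recording are not detected.

```python
import pandas as pd


def common_window(series_list):
    """Return (start, end) covered by every series, or None if no overlap.

    Hypothetical helper; assumes each series is non-empty and has a
    DatetimeIndex.
    """
    start = max(s.index.min() for s in series_list)
    end = min(s.index.max() for s in series_list)
    return (start, end) if start <= end else None


# Synthetic recording periods for three buildings.
a = pd.Series(1.0, index=pd.date_range("2013-01-01", "2013-12-31", freq="D"))
b = pd.Series(1.0, index=pd.date_range("2013-04-01", "2014-03-31", freq="D"))
c = pd.Series(1.0, index=pd.date_range("2013-06-01", "2013-09-30", freq="D"))

window = common_window([a, b, c])
```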

@OdysseasKr
Owner

@TomasSalvadores Your problem seems different from the one mentioned by the OP. Please open a new issue and describe your problem so that we can discuss it there.

@anasvaf
Author

anasvaf commented Oct 17, 2018

> > I managed to fix the error by simply removing the train and test.set_window. I assume that the toolkit can identify the common dates within all the buildings. I am using the UK-DALE dataset for my experiments.

> I am not sure whether the toolkit detects common sections across buildings. Also, by removing the limit, you are now getting all of the data available for each building.

You are right. I am just using all of the data available for each building. Sorry for any confusion :)

@maechler
Contributor

@anasvaf Thanks, you saved my day! I think there is an error in your code, though. I had to change
Y = np.reshape(meterpart.values, (batch_size, 1))
to
Y = np.reshape(meterpart.values, (batch_size, 1, 1))
in order to make it work.
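
Which of the two target shapes is needed presumably depends on the output shape of the model being trained, which the thread does not spell out. The sketch below derives the target shape from an assumed output shape using NumPy only. `reshape_targets` is a hypothetical helper, and the `output_shape` tuples are stand-ins for whatever the network reports (for example, the `output_shape` attribute of a Keras model).

```python
import numpy as np


def reshape_targets(values, output_shape):
    """Reshape flat targets to match a model output shape.

    `output_shape` follows the (None, d1, d2, ...) convention in which
    the first entry is the batch dimension. Hypothetical helper.
    """
    values = np.asarray(values)
    return values.reshape((-1,) + tuple(output_shape[1:]))


targets = np.arange(8.0)

y_2d = reshape_targets(targets, (None, 1))     # shape (8, 1)
y_3d = reshape_targets(targets, (None, 1, 1))  # shape (8, 1, 1)
```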

@bundit786

Hi @OdysseasKr, @anasvaf, @maechler, @TomasSalvadores
I have experimented with dae.train_across_buildings and got the error as well. I tested with the UK-DALE dataset and tried to learn a fridge model from House 1 and House 2. I tried the fixes recommended by both @anasvaf and @maechler, but they still didn't work. Below is the error (the same for both methods).

========== TRAIN ============
CHECKPOINT 0
0
Batch 0 of [25, 25]


AttributeError Traceback (most recent call last)
~\Anaconda3\envs\nilmtk-env\lib\site-packages\numpy\core\fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
55 try:
---> 56 return getattr(obj, method)(*args, **kwds)
57

~\Anaconda3\envs\nilmtk-env\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
5066 return self[name]
-> 5067 return object.__getattribute__(self, name)
5068

AttributeError: 'Series' object has no attribute 'reshape'

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
39 print("CHECKPOINT {}".format(epochs))
40
---> 41 dae.train_across_buildings(train_mains, train_meter, epochs=5, sample_period=sample_period)
42
43 epochs += 5

~\daedisaggregator.py in train_across_buildings(self, mainlist, meterlist, epochs, batch_size, **load_kwargs)
140 meterchunks = [self._normalize(m, self.mmax) for m in meterchunks]
141
--> 142 self.train_across_buildings_chunk(mainchunks, meterchunks, epochs, batch_size)
143 try:
144 for i in range(num_meters):

~\daedisaggregator.py in train_across_buildings_chunk(self, mainchunks, meterchunks, epochs, batch_size)
181 mainpart = mainpart[b*batch_size:(b+1)*batch_size]
182 meterpart = meterpart[b*batch_size:(b+1)*batch_size]
--> 183 X = np.reshape(mainpart.values, (batch_size, self.window_size, 1))
184 Y = np.reshape(meterpart.values, (batch_size, 1))
185

~\Anaconda3\envs\nilmtk-env\lib\site-packages\numpy\core\fromnumeric.py in reshape(a, newshape, order)
290 [5, 6]])
291 """
--> 292 return _wrapfunc(a, 'reshape', newshape, order=order)
293
294

~\Anaconda3\envs\nilmtk-env\lib\site-packages\numpy\core\fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
64 # a downstream library like 'pandas'.
65 except (AttributeError, TypeError):
---> 66 return _wrapit(obj, method, *args, **kwds)
67
68

~\Anaconda3\envs\nilmtk-env\lib\site-packages\numpy\core\fromnumeric.py in _wrapit(obj, method, *args, **kwds)
48 if not isinstance(result, mu.ndarray):
49 result = asarray(result)
---> 50 result = wrap(result)
51 return result
52

~\Anaconda3\envs\nilmtk-env\lib\site-packages\pandas\core\series.py in __array_wrap__(self, result, context)
733 """
734 return self._constructor(result, index=self.index,
--> 735 copy=False).__finalize__(self)
736
737 def __array_prepare__(self, result, context=None):
737 def array_prepare(self, result, context=None):

~\Anaconda3\envs\nilmtk-env\lib\site-packages\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
247 'Length of passed values is {val}, '
248 'index implies {ind}'
--> 249 .format(val=len(data), ind=len(index)))
250 except TypeError:
251 pass

ValueError: Length of passed values is 64, index implies 32768

Could you please suggest how to fix the problem?

Best,
Bundit

@pa8anas

pa8anas commented Oct 26, 2022

> Hi Ody! First of all thanks a lot for sharing your work and maintaining support! It's been really helpful 😃 I'm using REDD and don't have the empty dataframe problem, but another error as i explained above. Does the train_across_buildings method as it is in the repo work fine for you?

Where can I download the REDD dataset?
