Schema needed when uploading AVRO file from local filesystem, but not from GCS.
I'm trying to upload an Avro file from my local filesystem to BigQuery using the following code:

    random_table_name = make_random_string()
    table = dataset_.table(name=random_table_name)
    with open(schema_filepath, 'rb') as file_:
        schema = avro.schema.parse(file_.read())
    with tempfile.NamedTemporaryFile() as temp_file:
        with open(temp_file.name, 'wb') as file_:
            writer = avro.datafile.DataFileWriter(
                file_, avro.io.DatumWriter(), schema)
            writer.close()
        with open(temp_file.name, 'rb') as file_:
            job = table.upload_from_file(file_, source_format='AVRO')
    wait_for_job_to_finish(job)

I get the following error:
google.cloud.exceptions.BadRequest: 400 Empty schema specified for the load job. Please specify a schema that describes the data being loaded. (https://www.googleapis.com/upload/bigquery/v2/projects/aircraft-audio-classification/jobs?uploadType=multipart)
I don't need to specify a schema when using the web API or when loading a file from GCS.
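
For context on why a schema should not be needed: an Avro object container file stores its writer schema in the file header, under the `avro.schema` metadata key. Below is a minimal sketch in Python 3, using only the standard library, that reads that embedded schema back out of a file. The function names (`read_long`, `read_bytes`, `read_embedded_schema`) are hypothetical helpers written for illustration, not part of the `avro` or `google.cloud` packages, and the sketch only parses the header, not the data blocks.

```python
import io
import json


def read_long(stream):
    """Decode one Avro long (zigzag-encoded varint) from a binary stream."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of file while reading a long")
        accum |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read an Avro bytes/string value: a long length, then that many bytes."""
    length = read_long(stream)
    return stream.read(length)


def read_embedded_schema(stream):
    """Return the writer schema stored in an Avro container file header."""
    if stream.read(4) != b"Obj\x01":
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the metadata map
            break
        if count < 0:  # negative count is followed by the block size in bytes
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    return json.loads(metadata["avro.schema"].decode("utf-8"))
```

Calling `read_embedded_schema` on the temporary file from the snippet above, opened in `'rb'` mode, should return the same schema that was passed to `DataFileWriter`, which suggests the load job already has everything it needs to infer the table schema.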