Category: Reddit MachineLearning

This fascinating 1985 PC Magazine cover story on expert systems will help you appreciate just how far we’ve come [N]

Written on November 11, 2019. Posted in Reddit MachineLearning.

Back in 1985, PC Magazine dedicated an issue to expert systems, which were an early form of machine learning. Thanks to the Internet Archive, you can see the whole issue here. If you’re not familiar with expert systems, Wikipedia has you covered. I’m especially impressed by the review of Expert-Ease, which reads a lot like some of the user-friendly decision support tools you see emerging these days.

submitted by /u/TrueBirch

[P] Importing Pyspark PipelineModel with custom transformers into Scala

Written on November 11, 2019. Posted in Reddit MachineLearning.

I recently created a PipelineModel with a few custom transformers to generate features not doable with the native Spark transformers. Here’s an example of one of my transformers:

from pyspark.ml import Transformer
from pyspark.ml.param import Param, Params, TypeConverters
from pyspark.ml.param.shared import HasInputCol, HasOutputCol
from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable
from pyspark.sql.functions import udf


class newLabelMap(
    Transformer,
    HasInputCol,
    HasOutputCol,
    DefaultParamsReadable,
    DefaultParamsWritable,
):
    inputCol = Param(Params._dummy(), "inputCol", "The input column", TypeConverters.toString)
    outputCol = Param(Params._dummy(), "outputCol", "The output column", TypeConverters.toString)

    def __init__(self, inputCol="", outputCol=""):
        super(newLabelMap, self).__init__()
        self._setDefault(inputCol="")
        self._setDefault(outputCol="")
        self._set(inputCol=inputCol)
        self._set(outputCol=outputCol)

    def getInputCol(self):
        return self.getOrDefault(self.inputCol)

    def setInputCol(self, inputCol):
        self._set(inputCol=inputCol)

    def getOutputCol(self):
        return self.getOrDefault(self.outputCol)

    def setOutputCol(self, outputCol):
        self._set(outputCol=outputCol)

    def _transform(self, dataset):
        @udf("string")
        def findLabel(labelVal):
            # Map each old label to its new, coarser label.
            new_label_dict = {
                'oldLabel0': 'newLabel0',
                'oldLabel1': 'newLabel1',
                'oldLabel2': 'newLabel1',
                'oldLabel3': 'newLabel1',
                'oldLabel4': 'newLabel2',
                'oldLabel5': 'newLabel2',
                'oldLabel6': 'newLabel2',
                'oldLabel7': 'newLabel3',
                'oldLabel8': 'newLabel3',
                'oldLabel9': 'newLabel4',
                'oldLabel10': 'newLabel4',
            }
            try:
                labelKey = new_label_dict[labelVal]
                return labelKey
            except KeyError:
                # Unknown (or null) labels fall back to a catch-all bucket.
                return 'other'

        out_col = self.getOutputCol()
        in_col = dataset[self.getInputCol()]
        return dataset.withColumn(out_col, findLabel(in_col))

The transformer works fine in the Pipeline: I can save it, load it back into a PySpark session, and transform. The issue comes when I try to import it into a Scala environment. When I try to load the model, I receive this error output:

Name: java.lang.IllegalArgumentException
Message: requirement failed: Error loading metadata: Expected class name org.apache.spark.ml.PipelineModel but found class name pyspark.ml.pipeline.PipelineModel
StackTrace:
  at scala.Predef$.require(Predef.scala:224)
  at org.apache.spark.ml.util.DefaultParamsReader$.parseMetadata(ReadWrite.scala:638)
  at org.apache.spark.ml.util.DefaultParamsReader$.loadMetadata(ReadWrite.scala:616)
  at org.apache.spark.ml.Pipeline$SharedReadWrite$.load(Pipeline.scala:267)
  at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:348)
  at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:342)

If I remove the custom transformer, it loads just fine in Scala, so I’m curious: how can custom transformers written in PySpark be ported to a Scala environment as part of a PipelineModel? Do I need to amend my code in any way? Any help is greatly appreciated 🙂
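One possible workaround, sketched under the assumption that the custom stage is nothing more than a fixed label lookup: express the mapping as a SQL `CASE` statement and run it through Spark's built-in `SQLTransformer`, which is JVM-backed, so the pipeline would no longer contain a Python-only stage. The helper name `build_label_map_sql` and the column names `label` and `new_label` are hypothetical. The function only builds the statement string; the commented lines show where it might plug into a pipeline.

```python
def build_label_map_sql(mapping, input_col, output_col, default="other"):
    """Build a SQLTransformer statement that remaps labels via CASE WHEN.

    Hypothetical helper, not part of PySpark. `__THIS__` is the placeholder
    that SQLTransformer replaces with the input DataFrame.
    """
    if not mapping:
        raise ValueError("mapping must contain at least one label")
    if "`" in input_col or "`" in output_col:
        raise ValueError("column names must not contain backticks")

    def quote(value):
        # Keep the sketch simple: refuse values that would need SQL escaping.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in label: {value!r}")
        return f"'{value}'"

    branches = " ".join(
        f"WHEN {quote(old)} THEN {quote(new)}" for old, new in mapping.items()
    )
    return (
        f"SELECT *, CASE `{input_col}` {branches} "
        f"ELSE {quote(default)} END AS `{output_col}` FROM __THIS__"
    )


# Shortened version of the mapping from the post; column names are assumed.
statement = build_label_map_sql(
    {"oldLabel0": "newLabel0", "oldLabel1": "newLabel1", "oldLabel2": "newLabel1"},
    input_col="label",
    output_col="new_label",
)

# Where it would be used (requires a Spark installation):
# from pyspark.ml.feature import SQLTransformer
# label_stage = SQLTransformer(statement=statement)
```

Null or unseen labels fall through to the `ELSE` branch, which mirrors the `'other'` fallback in the UDF. Transformers with logic that cannot be written as SQL would likely need a Scala implementation of the same stage instead.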

submitted by /u/Octosaurus

[D] Adversarial Attacks on Obstructed Person Re-identification

Written on November 10, 2019. Posted in Reddit MachineLearning.

Hi,

I was reading this post about Chinese face recognition papers at ICCV (https://www.reddit.com/r/MachineLearning/comments/dp389c/d_iccv_19_the_state_of_some_ethically/?utm_medium=android_app&utm_source=share) and started wondering whether an adversarial attack on obstructed person re-identification models would be feasible, how it would be carried out, and whether it would actually make a relevant and useful personal project in the context of today’s power dynamics.

For example, with a GAN approach, one might use the re-identification model as a discriminator and try to train a generator to generate custom overlapping mask patches over one’s face in order to fool the discriminator into misidentifying the person.

What do you guys think?

submitted by /u/paubric

[D] Impala vs MCTS for self play

Written on November 10, 2019. Posted in Reddit MachineLearning.

AlphaStar uses Impala rather than tree search. Comments here explain this is mainly due to the width of the action space. But conceptually, I never grasped why one method makes better use of a given “exploration budget”.

A. Is it just tree width or also the episode length?

B. Someone (maybe Vinyals?) mentioned that “it would be hard to saturate the GPU” with tree search. So if SC2 were a lightweight, reversible environment, would (a narrow?) tree search become feasible?

(Let’s ignore issues such as hidden information, the agent league, real time, and the build order assistance.)

Thank you for any comments.
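To make the "exploration budget" question concrete, here is a rough arithmetic sketch. It assumes a uniform tree expanded level by level, which is a worst case: real MCTS expands selectively, usually guided by a policy prior. The function name and all the numbers are hypothetical and are not taken from AlphaStar.

```python
def full_tree_depth(budget, width):
    """Deepest level that a full tree of the given width can be expanded to
    without exceeding `budget` node evaluations (root not counted)."""
    if width < 1 or budget < 0:
        raise ValueError("width must be >= 1 and budget must be >= 0")
    depth, level_nodes, used = 0, 1, 0
    while True:
        level_nodes *= width  # number of nodes on the next level
        if used + level_nodes > budget:
            return depth
        used += level_nodes
        depth += 1


# Same budget of one million evaluations, different action-space widths.
narrow = full_tree_depth(1_000_000, width=10)
wide = full_tree_depth(1_000_000, width=1_000)
```

Under these assumptions the narrow tree reaches depth 5 while the wide one reaches only depth 1, so both width and episode length eat into the budget: the lookahead a fixed budget buys shrinks quickly as the action space widens, and long episodes need far more lookahead than that.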

submitted by /u/so_tiredso_tired

[P] Playing Snake with Reinforcement Learning

Written on November 10, 2019. Posted in Reddit MachineLearning.

Snake AI trained with reinforcement learning using a DQN, implemented in TensorFlow and Pygame.

Results of 650000 training steps

Github: https://github.com/luckymouse0/Snake-RL

submitted by /u/luckymouse0

[D] How to save feature vectors in production server?

Written on November 10, 2019. Posted in Reddit MachineLearning.

I have feature vectors with 2048 elements, and my features may change over time, e.g. when new features are added. I use Faiss as a search engine, but I am not quite sure how to store these vectors. Right now I am syncing a local folder with AWS S3. I do not think this is optimal, because I have to sync the files each time I search for similar vectors, which takes a while.

Maybe I should use a vector database (like https://github.com/a-mma/AquilaDB or some other) or is there a more optimized method to sync my local and s3 storage?
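One way to avoid syncing on every search, sketched under assumptions: keep the index in memory and reload it only when a cheap version check says the stored copy has changed. `IndexCache`, `fetch_version`, and `load_index` are hypothetical names. In a real setup `fetch_version` might return the S3 object's ETag (for example from boto3's `head_object`), and `load_index` might download the file and call `faiss.read_index` on it. The stand-in functions below only simulate that.

```python
class IndexCache:
    """Keep a loaded index in memory and reload it only when its version changes."""

    def __init__(self, fetch_version, load_index):
        self._fetch_version = fetch_version  # cheap metadata call
        self._load_index = load_index        # expensive download and load
        self._version = None
        self._index = None

    def get(self):
        version = self._fetch_version()
        if self._index is None or version != self._version:
            self._index = self._load_index()
            self._version = version
        return self._index


# Stand-ins for the remote store, for illustration only.
remote = {"version": "v1"}
loads = []


def fake_fetch_version():
    return remote["version"]


def fake_load_index():
    loads.append(remote["version"])
    return f"index-{remote['version']}"


cache = IndexCache(fake_fetch_version, fake_load_index)
cache.get()
cache.get()                # reuses the in-memory index, no reload
remote["version"] = "v2"   # a new index was uploaded
cache.get()                # version changed, so the index is reloaded once
```

This still performs one version check per search. If that is too slow as well, the check could be done on a timer instead.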

submitted by /u/_pydl_

[P] Auptimizer – A faster, easier way to do HPO

Written on November 10, 2019. Posted in Reddit MachineLearning.

Hey all, a team I’m part of just open-sourced our internal HPO tool called Auptimizer. Auptimizer does a couple of things. It provides a single interface to 6 different HPO algorithms including Spearmint and HyperOpt. It also makes it easy to scale your model training from CPUs and GPUs all the way to multiple instances on AWS. The repo is on Github. We have an article about it on Medium and you can find more implementation details in our 2019 IEEE Big Data paper. If you do HPO, check it out and let us know on Github how things look.

submitted by /u/YetAnotherAI

[D] Why does a network run much faster after loading the trained model (parameters)?

Written on November 10, 2019. Posted in Reddit MachineLearning.

I’ve found that before loading a trained model, the network runs at a relatively low speed. After loading the .pth file, inference becomes about 10 times faster. Is this normal in deep learning?

I’ve tried this with SSD (Single Shot MultiBox Detector) on the COCO object detection task; the code is written in Python with PyTorch 1.0.

Before loading, I got 2-3 fps on a GTX 1080, and after loading, it reached 20 fps on the same device in the same environment.
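A minimal timing sketch that could help narrow down where the time goes, for instance by timing the forward pass and the post-processing separately. One possible cause, which the post does not confirm, is that untrained weights let many more boxes pass the confidence threshold, so post-processing such as NMS has more work to do. `measure_fps` and its parameters are hypothetical names. With PyTorch on a GPU, `torch.cuda.synchronize` could be passed as `sync`, because CUDA kernels run asynchronously.

```python
import time


def measure_fps(run_once, warmup=5, iters=20, sync=None):
    """Return calls per second of `run_once`, measured after a warm-up phase."""
    for _ in range(warmup):
        run_once()  # warm-up runs are excluded from the measurement
    if sync is not None:
        sync()      # wait for pending asynchronous work before starting the clock
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    if sync is not None:
        sync()      # make sure all timed work has actually finished
    elapsed = time.perf_counter() - start
    return iters / max(elapsed, 1e-9)  # guard against a zero-length interval


# Stand-in workload; in practice this would be the model's forward pass.
calls = []
fps = measure_fps(lambda: calls.append(sum(range(1000))), warmup=2, iters=10)
```

Running this once with only the forward pass and once with the full detection pipeline, before and after loading the weights, would show which part changes speed.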

submitted by /u/AlphaGoMK

[R] MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams

Written on November 10, 2019. Posted in Reddit MachineLearning.

MIDAS detects microcluster anomalies from an edge stream in constant time and memory, while providing theoretical guarantees about its false positive probability. Microcluster anomalies are suddenly arriving groups of suspiciously similar edges, such as lockstep behavior and denial of service attacks in network traffic data.
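A simplified sketch of the scoring idea, not the authors' implementation: it uses exact dictionaries where MIDAS uses count-min sketches, so it does not have the constant-memory property. The score is the chi-squared style statistic from the paper, computed from an edge's count in the current tick and its total count so far. The class name `ExactMidas` is hypothetical, and ticks are assumed to be non-decreasing integers starting at 1.

```python
from collections import defaultdict


class ExactMidas:
    """Toy MIDAS-style scorer with exact counts instead of count-min sketches."""

    def __init__(self):
        self.total = defaultdict(int)    # s: edge count over all ticks so far
        self.current = defaultdict(int)  # a: edge count within the current tick
        self.tick = 1

    def score(self, src, dst, tick):
        if tick > self.tick:
            self.current.clear()  # a new tick starts: reset per-tick counts
            self.tick = tick
        edge = (src, dst)
        self.current[edge] += 1
        self.total[edge] += 1
        a, s, t = self.current[edge], self.total[edge], self.tick
        if t <= 1:
            return 0.0  # no history to compare against yet
        # Deviation of the current-tick count from the historical mean s / t.
        return (a - s / t) ** 2 * t ** 2 / (s * (t - 1))


detector = ExactMidas()
# One edge per tick for five ticks, then a burst of ten edges in tick 6.
steady = [detector.score("a", "b", t) for t in range(1, 6)]
burst = [detector.score("a", "b", 6) for _ in range(10)]
```

In this toy stream the steady edges score 0, while the scores within the burst grow as the same edge keeps arriving in one tick, which is the "suddenly arriving group" behavior described above.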

Paper: https://www.comp.nus.edu.sg/~sbhatia/assets/pdf/midas.pdf (Accepted at AAAI 2020)

Code: https://github.com/bhatiasiddharth/MIDAS

Feedback is welcome!

submitted by /u/siddharthb_

How would you try to build an AGI if you had truly unlimited compute? [discussion]

Written on November 10, 2019. Posted in Reddit MachineLearning.

If somehow you managed to get your hands on a computer with literally unlimited compute power (through an interstellar alien technology exchange or whatever), and could feed it any/all data currently available online, what would be your approach to creating a “true” AGI? Is there any approach currently out there that might realistically result in an AGI if given enough processing power/data?

I’m currently trying my hand at writing speculative fiction, and was wondering what a realistic approach in such a scenario might look like…

submitted by /u/yitzilitt
